J. Mol. Biol. (1992) 226, 349-366
Sequence-specific DNA Binding by a Two Zinc-finger Peptide from the Drosophila melanogaster Tramtrack Protein
Louise Fairall, Stephen D. Harrison†, Andrew A. Travers and Daniela Rhodes
MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England
(Received 22 November 1991; accepted 18 March 1992)
We show that the DNA-binding domain of the Drosophila melanogaster regulatory protein Tramtrack consists of a 66 amino acid sequence containing two zinc-finger motifs and a short sequence N-terminal to the first finger motif. This short N-terminal sequence is essential for DNA binding and we suggest it is involved in maintaining the three-dimensional structure of the first finger domain, as has been seen in the nuclear magnetic resonance structure of one of the zinc-finger domains of the yeast transcription factor SWI5. The characterization of the DNA-binding activity of this 66 residue peptide (Δ911zf) shows that it binds in a sequence-specific manner, as a monomer, to a natural target site with an apparent Kd ≈ 4 × 10⁻⁷ M. The shortest Δ911zf binding site, which retains full affinity, consists of an 11 base-pair sequence with a one nucleotide overhang at each 5' end. DNase I, hydroxyl radical and methylation protection footprinting studies show that, in common with other zinc-finger proteins, Δ911zf binds in the major groove of DNA. The data presented are consistent with the zinc-fingers of Tramtrack contacting both strands of the DNA, and thus the binding differs in detail from that observed in the crystal structure of the three zinc-fingers of Zif268 complexed to their target DNA.
Keywords: zinc-finger; DNA recognition; Tramtrack; footprinting; protein expression
† Present address: Howard Hughes Medical Institute, University of California, Berkeley, CA 94720, U.S.A.
0022-2836/92/140349-18 $03.00/0 © 1992 Academic Press Limited
1. Introduction
The zinc-finger DNA-binding motif, first discovered in the Xenopus transcription factor IIIA (TFIIIA), consists of a sequence motif of about 30 amino acid residues of the sequence: ΨXCysX₂₋₄CysX₃ΨX₅ΨX₂HisX₃₋₄HisX (X stands for any amino acid, Ψ for hydrophobic residues), folded around a zinc ion to form an independent mini-domain (Miller et al., 1985). Homologous sequence motifs have since been found repeated in different numbers within the sequence of about 200 proteins (G. Jacobs, personal communication), many of which are known to be involved in the regulation of gene expression (Klug & Rhodes, 1987). For a number of DNA-binding transcription factors, TFIIIA (Smith et al., 1984), Sp1 (Kadonaga et al., 1987), SWI5 (Nagai et al., 1988) and Krox 20 (Nardelli et al., 1991), it has been demonstrated experimentally that the region of the protein containing the zinc-finger motifs is sufficient for directing sequence-specific DNA recognition. More recently it has also been shown that two zinc-fingers are sufficient for sequence-specific recognition. These are from the human enhancer binding protein MBP-1 (Sakaguchi et al., 1991) and the yeast transcription factor ADR1 (Thukral et al., 1991). Recently the crystal structure of a complex between the three zinc-finger motifs of the DNA-binding domain of the mouse regulatory protein Zif268 and its target DNA was determined. This gives the first detailed insight into how zinc-fingers recognise DNA in a sequence-specific manner (Pavletich & Pabo, 1991). This structure confirms the general architecture of zinc-finger domains, derived earlier from model building (Berg, 1988; Gibson et al., 1988) and nuclear magnetic resonance structure determinations of a number of single zinc-finger peptides (Lee et al., 1989; Klevit et al., 1990; Omichinski et al., 1990). The overall structure of the zinc-finger motif consists of a two-stranded β-sheet
packed against an α-helix. A zinc ion is tetrahedrally co-ordinated by the pair of histidine and the pair of cysteine residues: the cysteine ligands are located in the β-sheet and the histidine ligands in the α-helix. In the Zif268-DNA complex the protein spirals continuously around the major groove of an 11 base-pair DNA fragment and the binding site of each finger spans three base-pair steps. Each finger makes nearly equivalent contacts with the DNA as a consequence of the highly repetitive nature of the sequence of both the protein and DNA in this particular complex. Most of the contacts are made by the α-helices: five out of six base-specific contacts are made by arginine (the 6th is a histidine) with guanine, all located on one strand of the DNA helix. The two linkers between finger domains, which in this protein are of the most common type TGEKP (Pavletich & Pabo, 1991), appear to be passive in that the orientation of adjacent fingers seems to be determined largely by the contacts made by the finger domains to the DNA. Although this structure provides a clear picture as to how zinc-fingers are used in binding to DNA and in detail how recognition of guanine bases and phosphate groups takes place, it does not provide an explanation as to how zinc-finger domains are used to recognise sequences of a different character. Furthermore, the comprehensive footprinting data on TFIIIA (Fairall et al., 1986; Churchill et al., 1990), which indicate that some of the linkers between fingers must cross the minor groove, cannot in a simple way be reconciled with the arrangement of finger domains seen in the Zif268-DNA complex. Together, these observations indicate that other zinc-finger proteins may be arranged differently on DNA. Consequently, to understand the full repertoire of sequence-specific recognition by this ubiquitous DNA-binding module it is necessary to study the structure of a number of other zinc-finger-DNA complexes.
For any one particular transcription factor the length and sequence of its specific DNA binding site is likely to be determined by the number of zinc-fingers present, their amino acid sequence and also their disposition on the DNA. In addition, the stretch of sequence linking finger domains in different proteins varies in both length and sequence and may be related to the mode of binding. We have studied the Tramtrack (Ttk) protein from Drosophila melanogaster in order to explore further how zinc-finger motifs are used in sequence-specific recognition. Ttk is a sequence-specific DNA-binding protein (6%5 kDa), which contains two adjacent zinc-finger motifs. It is involved in the regulation of transcription of the pair-rule gene fushi tarazu (ftz) (Harrison & Travers, 1990; Brown et al., 1991). Upstream from the transcription startpoint of ftz there are three regions involved in the control of its transcription, the "zebra element", the "neurogenic element" and the "upstream element", which also acts as an enhancer element (Hiromi et al., 1985). Ttk binds to at least four sites within this region: one site is contained within the upstream element and the other three in the zebra element
(Harrison & Travers, 1990). The sequence of the Ttk binding sites is not heavily conserved (see Discussion). The binding site located in the upstream element was used to isolate a λgt11 clone encoding the C-terminal moiety of the protein (Harrison & Travers, 1990), while a major zebra element site was used to purify by affinity chromatography the protein FTZ-F2, subsequently shown to be identical to Ttk (Brown et al., 1991). The spatial and temporal expression of ttk RNA is approximately complementary to that of ftz. Consequently it has been proposed that Ttk protein acts as a repressor preventing ectopic expression of ftz during embryogenesis (Harrison & Travers, 1990; Brown et al., 1991), although the abundance of both ttk RNA and protein indicates that this may not be its sole function. The DNA-binding domain of Ttk differs from that of Zif268 in a number of ways and consequently Ttk is a good candidate for studying the role and versatility of zinc-fingers in DNA recognition. First, Ttk is unusual as it contains only two zinc-finger motifs, whereas many other proteins contain three or more. Second, in Zif268 all the fingers have three amino acids separating the conserved histidine residues that co-ordinate zinc, whereas the two fingers in Ttk have four amino acids separating the conserved histidine residues. A consequence of four instead of three amino acids between the conserved histidine residues is that the α-helix would be disrupted or distorted. This is exemplified by the two-dimensional nuclear magnetic resonance structure of a zinc-finger from the human male-associated protein ZFY in which there are also four amino acids between the conserved histidine residues (Kochoyan et al., 1991). This might affect the disposition of the protein on the DNA. In the Zif268 structure the linkers (TG(E/Q)KP) between the fingers are extended and do not make any interactions with the DNA.
In Ttk the linker sequence (KRNVKV) is longer and sufficiently basic that it seems plausible that it could make specific or non-specific contacts with the DNA. This linker sequence shows some similarity to the consensus sequence for linkers proposed to cross the minor groove (Kochoyan et al., 1991). DNase I footprinting data of the full-length Ttk protein had delimited a binding site of about 20 base-pairs of the sequence TAATAACGA TAACGTCCA (Harrison & Travers, 1990). This DNA sequence lacks the strand-asymmetric distribution of guanine residues of the Zif268 binding site, thus the details of base-specific recognition must differ. We have expressed in Escherichia coli a number of different-sized deletions of ttk and investigated their DNA-binding properties. We show here that a short peptide of 66 amino acids, including the two zinc-finger motifs, is sufficient to direct the sequence-specific binding of Ttk. This peptide contains seven amino acids N-terminal to the first finger motif. Deletion of these residues results in a loss of DNA-binding activity and thus suggests that they are an essential part of the DNA-binding domain. This 66-residue DNA-binding domain, which was purified as an unfolded peptide, was refolded in the presence of zinc to give full DNA-binding activity. The DNA-binding site was defined using several footprinting reagents and by measuring dissociation constants to oligonucleotides of different length.
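The zinc-finger consensus quoted at the start of the Introduction can be expressed as a pattern search. The sketch below is illustrative only: the spacer lengths (2 to 4 residues between the cysteines, 3 to 4 between the histidines) and the set of residues counted as hydrophobic are assumptions of this example, and the test sequence is the commonly quoted first finger of Zif268, not a sequence taken from this paper.

```python
import re

# Residues accepted at the hydrophobic (Ψ) positions; this set is an assumption.
HYDROPHOBIC = "FYLIVMW"

# Ψ-X-Cys-X(2-4)-Cys-X3-Ψ-X5-Ψ-X2-His-X(3-4)-His, written as a regular expression.
ZINC_FINGER = re.compile(
    rf"[{HYDROPHOBIC}].C.{{2,4}}C.{{3}}[{HYDROPHOBIC}].{{5}}[{HYDROPHOBIC}].{{2}}H.{{3,4}}H"
)


def find_zinc_fingers(protein: str) -> list[tuple[int, str]]:
    """Return (start index, matched text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in ZINC_FINGER.finditer(protein.upper())]


# Commonly quoted sequence of Zif268 finger 1 (one-letter code), used as a test case.
ZIF268_FINGER1 = "RPYACPVESCDRRFSRSDELTRHIRIHTGQKP"

if __name__ == "__main__":
    for start, motif in find_zinc_fingers(ZIF268_FINGER1):
        print(start, motif)
```

A finger with four residues between the two histidines, as described for Ttk, would be accepted by the same pattern because the final spacer allows either three or four residues.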
2. Materials and Methods
(a) Expression of different deletions of tramtrack
The 3 Tramtrack fragments R113, Δ911 and Δ912 (Fig. 1) were cloned as EcoRI-EcoRI fragments into the phage T7 RNA polymerase expression vector pGM484. pGM484 is a descendant of pRK172 (McLeod et al., 1987) with the SspI fragment containing the M13 origin of replication from pBluescript KS- cloned into the SspI site (G. Micklem, personal communication). The presence of the M13 origin of replication permits both direct (single-stranded) sequencing and mutagenesis of the cloned gene. To facilitate the cloning of the EcoRI-EcoRI fragments in the correct reading frame an oligonucleotide containing an EcoRI restriction site was introduced into the NdeI site of pGM484. R113 was cloned from the λgt11 clone, isolated by Harrison & Travers (1990), containing the C-terminal moiety of the ttk coding sequence. Δ911 and Δ912 were generated by site-directed mutagenesis in M13. Δ911zf and Δ912zf were then created by inserting a stop codon immediately after the second zinc-finger motif by site-directed mutagenesis on pGM484+Δ911 or Δ912. To aid the reproducibility of expression all the fragments were then recloned as NdeI-PstI fragments into pET11a and sequenced. The full length protein was cloned as a completely native sequence by mutating the first methionine to give an NdeI site. The gene (Harrison & Travers, 1990) was first cut with HaeII, treated with mung bean nuclease to create a blunt end and then cut with NdeI. This fragment was then cloned into pET11 (with the M13 origin of replication) by first cutting the vector with EcoRI, filling in the end with Klenow and then cutting with NdeI. The pET11 series of vectors contain the lacI gene encoding the lac repressor and a corresponding operator site adjacent to the phage T7 promoter. This keeps the genes completely repressed until induction with IPTG† (Studier et al., 1990). The various ttk constructs were expressed in E. coli by transformation of E. coli strain BL21(DE3) with the appropriate plasmid.
Colonies were then grown at 37°C in 2 × TY, 100 µg ampicillin/ml until the absorbance reached 0.6, and then protein synthesis was induced by the addition of IPTG to 0.4 mM (Studier et al., 1990). The induced cultures were grown for a further 2 h and harvested by centrifugation at 1000 g for 10 min. Cell pellets were stored at -20°C.
† Abbreviations used: IPTG, isopropyl-1-thio-β,D-galactoside; bp, base-pair(s); DTT, dithiothreitol; PMSF, phenylmethylsulphonyl fluoride; BSA, bovine serum albumin.
(b) Analysis of the DNA-binding activity of the different deletions of tramtrack
The DNA-binding activity of the different length ttk deletion products was analysed using the crude soluble fraction of the cell lysates. Cells from 100 ml of culture, containing expressed proteins, were resuspended in 10 ml of 20 mM-Mes (pH 6.5), 10% (v/v) glycerol, 1 mM-benzamidine and sonicated. Insoluble material was removed by centrifugation. DNA-binding activities were measured using a radioactively labelled 100 bp HindIII-EcoRI fragment of the number 16 binding site from the ftz promoter upstream element (Harrison & Travers, 1990) in a binding buffer containing 20 mM-Mes (pH 6.5), 1 mM-benzamidine, 0.1% (v/v) NP40 and 100 µg poly[d(I-C)]/ml. Then 1 µl of crude lysate was added to DNA at a concentration of 1 × 10⁻⁸ M in a volume of 20 µl, incubated at room temperature and analysed by electrophoresis at room temperature in a 6% (w/v) polyacrylamide gel (29:1, acrylamide/bis-acrylamide) buffered with 45 mM-Tris-borate (pH 8.5) and visualized by autoradiography.
(c) Purification of the DNA-binding domain Δ911zf
As the 66 amino acid residue protein (Δ911zf) was only partially soluble, the strategy for protein purification was to extract and purify the insoluble protein in a denatured state and then to renature the protein in the presence of zinc before a final column purification. Wet cell paste (12 g) from 3 l of culture was resuspended in 15 ml of 20 mM-Tris·HCl (pH 8), 25% (w/v) sucrose, 2 mM-MgCl₂. Lysozyme (5 mg) was added and the mixture incubated for 15 min at room temperature. Then 1.5 ml of buffer containing 4 mM-EDTA, 0.2 M-NaCl, 1% (w/v) deoxycholic acid, 1% NP40 and 20 mM-Tris·HCl (pH 7.5) was added and incubated for a further 15 min at room temperature. The chromosomal DNA was digested by addition of MgCl₂ and DNase I to 10 mM and 25 µg/ml, respectively, and incubation at room temperature until the lysate was no longer viscous. The lysate was then centrifuged at 5000 g for 20 min. The Δ911zf protein was in the pellet. The pellet was washed twice by resuspension in 50 ml of 0.5% (v/v) Triton X-100, 1 mM-EDTA and then solubilized in 20 ml of 5 M-guanidine·HCl, 1 mM-DTT, 0.5 mM-PMSF. Insoluble material was removed by centrifugation at 8000 g for 10 min. The supernatant was then dialysed overnight against 2 l of 8 M-urea, 1 mM-DTT, 0.5 mM-PMSF at room temperature. The supernatant was fractionated by chromatography on a 20 ml DEAE-Sephacel column (Pharmacia) equilibrated in 8 M-urea, 1 mM-DTT, 0.5 mM-PMSF. This step removes nucleic acid and the Δ911zf protein is in the flow-through. The flow-through and wash were collected, diluted to 200 ml in the 8 M-urea buffer, dialysed against one change of 4 M-urea, 0.5 mM-PMSF, 0.5 mM-DTT, 20 mM-Mes (pH 6.5), 100 µM-zinc acetate and then against 2 changes of buffer A (20 mM-Mes (pH 6.5), 1 mM-sax,, 0.5 mM-PMSF, 0.5 mM-DTT, 100 µM-zinc acetate, 10% glycerol). The protein was then bound to 20 ml of CM-Sepharose equilibrated in buffer A using a batch procedure in siliconized 50 ml Falcon polypropylene conical tubes.
After centrifugation at 1000 g the resin was washed in 100 ml of buffer A, 0.2 M-NaCl and then the protein was eluted in 30 ml of buffer A, 0.8 M-NaCl.
(d) Preparation of binding site DNA
Oligonucleotides were synthesized on an Applied Biosystems 380B DNA synthesizer. The single strands were purified on 8% or 20% (w/v) denaturing polyacrylamide gels for the long and short oligonucleotides, respectively. After elution from gels the oligonucleotides were further purified on C-18 Sep-Pak cartridges (Waters). The molarity of each oligonucleotide was calculated from the molar absorptivities (ε260) of the 4 nucleotides (Sproat & Gait, 1984). The 2 strands were mixed in equimolar
amounts and annealed in 20 mM-Tris·HCl (pH 7.4), 100 mM-NaCl, 1 mM-EDTA by heating at 80°C and allowing to cool slowly. Then the oligonucleotides were dialysed against deionized water to remove the EDTA, which inhibits binding of zinc-finger proteins to DNA.
(e) Analysis of the DNA-binding activity of Δ911zf
The percentage of protein able to bind to DNA, or active protein concentration, was estimated by titration of a 21 bp DNA-binding site, at a concentration above the estimated binding constant, with increasing amounts of protein (Riggs et al., 1970). Binding reactions were carried out in siliconized Eppendorf tubes and the binding buffer contained 20 mM-Mes (pH 6.5), 1 mM-MgCl₂, 10% glycerol. After incubation for 10 min at room temperature samples were made 4% glycerol, 4 mM-Tris·HCl (pH 7.4), 0.02% (w/v) bromophenol blue and analysed by electrophoresis in 0.7% (w/v) agarose (BRL electrophoresis grade) gels. These gels were buffered in 45 mM-Tris-borate (pH 8.5) and electrophoresis carried out at 30 mA at room temperature. To visualize the amount of DNA bound to protein, gels were stained with ethidium bromide (1 µg/ml) for about 30 min (Miller et al., 1989). At a protein/DNA ratio of 0.25:1 the amount of protein-DNA complex formed is insensitive to the binding constant. We have therefore densitometer-traced the negative of the photograph of the gel (Fig. 3(b)) and from this calculated the percentage of complex formed in each lane. From this the concentration of protein that gives 25% complex formation was calculated, and thus the active protein concentration. To estimate the fraction of DNA bound and thus the fraction of active protein, the gels were stained for protein with 0.1% (w/v) PAGE blue 83 (BDH) in 45% methanol, 10% acetic acid and destained in 5% methanol, 7% acetic acid.
(f) DNase I footprinting
In order to study the binding site of the deletion mutant Δ911zf of Ttk, the following DNA fragment containing a natural target site (derived from the ftz promoter upstream element) was used in DNase I footprinting studies:
5'-TCCGACATCAACATCTAATAAGGATAACGTCCATTAACAATGATCGGTT-3'
3'- GGCTGTAGTTGTAGATTATTCCTATTGCAGGTAATTGTTACTAGCCAAG-5'
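As a consistency check on the fragment shown above, the short sketch below tests that the two strands pair when each carries a single unpaired nucleotide at its 5' end, and reports which nucleotide a polymerase would add to each recessed 3' end. The function names and the one-nucleotide stagger are assumptions of this example; the strand sequences are transcribed from the fragment as printed, split where the printed lines break.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top strand written 5'->3'; bottom strand written 3'->5', as it sits under the top strand.
TOP_5_TO_3 = "TCCGACATCAACATCTAATAAGGA" "TAACGTCCATTAACAATGATCGGTT"
BOTTOM_3_TO_5 = "GGCTGTAGTTGTAGATTATTCCT" "ATTGCAGGTAATTGTTACTAGCCAAG"


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the sequence."""
    return seq.translate(COMPLEMENT)


def is_staggered_duplex(top_5_to_3: str, bottom_3_to_5: str, overhang: int = 1) -> bool:
    """True if the strands pair once `overhang` 5' nucleotides are left unpaired on each."""
    top_core = top_5_to_3[overhang:]
    bottom_core = bottom_3_to_5[: len(bottom_3_to_5) - overhang]
    return complement(top_core) == bottom_core


def fill_in_bases(top_5_to_3: str, bottom_3_to_5: str, overhang: int = 1) -> tuple[str, str]:
    """Nucleotides added to the recessed 3' end of the top and bottom strand, respectively."""
    top_adds = complement(bottom_3_to_5[len(bottom_3_to_5) - overhang:])
    bottom_adds = complement(top_5_to_3[:overhang])
    return top_adds, bottom_adds


if __name__ == "__main__":
    print(is_staggered_duplex(TOP_5_TO_3, BOTTOM_3_TO_5))
    print(fill_in_bases(TOP_5_TO_3, BOTTOM_3_TO_5))
```

Under these assumptions the top strand is extended with C and the bottom strand with A, which agrees with the use of labelled dCTP and dATP for the top and bottom strands described in the text.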
This DNA fragment was also used to estimate the binding constant using the DNase I footprinting method (Johnson et al., 1979). The 3' end of each strand was labelled using reverse transcriptase and [α-³²P]dCTP or [α-³²P]dATP for the top strand or bottom strand, respectively. This DNA fragment was used at a concentration of 1 × 10⁻⁸ M and the concentration of active protein varied from 9 × 10⁻⁷ M to 7 × 10⁻⁹ M. The binding buffer was 20 mM-Mes (pH 6.5), 100 µg poly[d(I-C)]/ml, 0.1% NP40, 100 µg BSA/ml, 2 mM-MgCl₂, 40 mM-NaCl. DNA and protein-DNA complexes were incubated with DNase I (2 µg/ml) for 1 and 2 min at room temperature and digestions stopped by the addition of EDTA to 4 mM. The samples were extracted with phenol/chloroform and the aqueous phase made 50% formamide, 5 mM-NaOH, 0.05% (w/v) bromophenol blue and 0.05% (w/v) xylene cyanol. Analysis was carried out in 20% denaturing polyacrylamide gels. To avoid the loss of short DNA fragments during gel drying, the gels were wrapped in Saran wrap and autoradiographed wet at -70°C.
(g) Hydroxyl radical footprinting
The 49 bp DNA fragment containing a natural target site (shown above) was also used in the hydroxyl radical footprinting studies. The ³²P-labelled DNA binding site was used at a concentration of 1 × 10⁻⁸ M and the protein (Δ911zf) concentration was varied from 0.5 to 5 × 10⁻⁶ M. The binding buffer contained 20 mM-Mes (pH 6.5), 40 mM-NaCl, 2 mM-MgCl₂, 0.1% NP40, 100 µg BSA/ml and 10 µg sonicated calf thymus DNA/ml. To cleave the DNA in the control and protein-DNA complex, 2 µl each of 1 mM-Fe(II)/2 mM-EDTA, 0.03% H₂O₂ and 10 mM-sodium ascorbate were added to 20 µl reaction mixes (Tullius et al., 1987). The hydroxyl radical was quenched almost immediately so there was no need to stop the reaction. Samples were then extracted with phenol/chloroform and the aqueous phase made 50% formamide, 5 mM-NaOH, 0.05% bromophenol blue, 0.05% xylene cyanol, 1 mM-EDTA. Subsequent analysis was carried out as described above for the DNase I footprinting experiments.
(h) Quantitative analysis of the footprinting data
The data from the DNase I and the hydroxyl radical footprinting studies were treated in the same manner. The method involves subtracting the cleavage pattern of the naked DNA from that of the protein-DNA complex so that the binding site of the protein can be visualized more clearly (Rhodes, 1989). Digital images were produced by densitometry of the autoradiographs of the gels using a laser densitometer custom built in this laboratory. A profile of each lane was obtained, and the area under each peak measured using the computer programme GELTRAK (Smith & Thomas, 1990). For any bonds with no detectable cleavage a value that equalled a third of the smallest measurable peak was assigned for the area. The probability of cleavage at each bond was calculated using the equation:
$$P_n = \frac{A_n}{A_0 + \sum_{i \ge n} A_i}$$
i.e. by dividing the integrated area (Aₙ) of a band n by the sum of the integrated area (A₀) of uncut DNA plus all the fragments (bands) longer than, and including, band n (Lutter, 1978). In these experiments the difference probability was calculated for each base by subtracting the natural logarithm of the probability of cutting for the naked DNA from the natural logarithm of the probability of cutting for the DNA-Ttk complex. For the hydroxyl radical footprinting experiments, each lane was scanned 4 times, and the average area for each peak was used in the calculations. The difference probability was then plotted against the sequence. In each case the bond cut is to the 3' side of the base numbered.
(i) Methylation by dimethylsulphate
Methylation by dimethylsulphate (Maxam & Gilbert, 1980) of the naked DNA and the Ttk-DNA complex was carried out in a binding buffer containing 20 mM-Mes (pH 6.5), 40 mM-NaCl, 2 mM-MgCl₂, 0.1% NP40, 100 µg BSA/ml, 10 µg sonicated calf thymus DNA/ml, as done for the TFIIIA-DNA complex (Fairall et al., 1986). The DNA was labelled with ³²P at the 5' ends of the EcoRI-HindIII fragment of the number 16 binding site from the ftz promoter upstream element (Harrison & Travers, 1990). In the binding reaction the DNA concentration was 1 × 10⁻⁸ M and the protein concentration was 1 × 10⁻⁶ M. A final concentration of dimethylsulphate of 0.25% (v/v) was used and the reaction carried out at 20°C for up to 6 min. Volumes of 18 µl were taken at various times and added to 10 µl of "stop solution", which was the same as for Maxam & Gilbert sequencing except that it contained 15 mM-EDTA. In order to avoid depurination 10 mM-EDTA was also included in all the subsequent buffers and the 0.3 M-sodium acetate also contained 100 µg sonicated DNA/ml. Cleavage at the methylated base was carried out in 10 µl of 10% (v/v) piperidine at 90°C for 30 min. The piperidine was removed in a Speed Vac (Savant). Analysis was carried out in 8% denaturing polyacrylamide gels, which were then dried, and autoradiography was carried out at -70°C.
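The quantitative treatment of the footprinting data in section (h) can be sketched as follows. This is a minimal illustration, not the GELTRAK procedure itself: the function names are hypothetical, band areas are assumed to be listed from the shortest fragment to the longest, and the numbers in the usage lines are invented for the example.

```python
import math


def floor_undetected(band_areas: list[float]) -> list[float]:
    """Replace zero areas by one third of the smallest measurable peak."""
    floor = min(a for a in band_areas if a > 0) / 3
    return [a if a > 0 else floor for a in band_areas]


def cleavage_probabilities(band_areas: list[float], uncut_area: float) -> list[float]:
    """P_n = A_n / (A_uncut + sum of areas of band n and all longer bands).

    `band_areas` must be ordered from the shortest fragment to the longest.
    """
    remaining = uncut_area + sum(band_areas)  # denominator for the shortest band
    probabilities = []
    for area in band_areas:
        probabilities.append(area / remaining)
        remaining -= area  # drop this band before moving to the next longer one
    return probabilities


def difference_probability(complex_areas, complex_uncut, naked_areas, naked_uncut):
    """ln P(complex) - ln P(naked DNA) per bond; negative values indicate protection."""
    p_complex = cleavage_probabilities(floor_undetected(complex_areas), complex_uncut)
    p_naked = cleavage_probabilities(floor_undetected(naked_areas), naked_uncut)
    return [math.log(pc) - math.log(pn) for pc, pn in zip(p_complex, p_naked)]


if __name__ == "__main__":
    naked = [10.0, 20.0, 30.0]      # invented band areas, shortest fragment first
    protected = [5.0, 20.0, 30.0]   # same lane with the first bond partly protected
    print(difference_probability(protected, 45.0, naked, 40.0))
```

Working in logarithms means that a given fold-reduction in cutting gives the same difference value whether the bond is cut strongly or weakly in naked DNA.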
(j) Measurements of binding constants using band-shift gels
The 5' end of each oligonucleotide was radioactively labelled using [γ-³²P]ATP and polynucleotide kinase before annealing with the complementary strand. In these experiments the binding site was at a concentration of 1 x lo-' M and the Δ911zf concentration was varied. The binding buffer contained 10 µg poly[d(I-C)]/ml, 0.1% NP40, 2 mM-MgCl₂, 40 mM-NaCl, 100 µg BSA/ml, 20 mM-Mes (pH 6.5), 10% glycerol. The samples were analysed in 0.7% agarose gels as described, but were dried at 60°C on DE81 paper (Whatman) and autoradiography carried out. Dissociation constants (Kd) were calculated using the following equation (Riggs et al., 1970; Johnson et al., 1979):
$$K_d = \frac{[P][D]}{[PD]}$$
where P = Δ911zf and D = DNA-binding site. When [D] << Kd, then
$$[P]_{free} \approx [P]_{total}$$
so
$$K_d = [P]_{total} \times \frac{[D]}{[PD]}$$
i.e. [P]total when 50% of the DNA is complexed.
(k) Competition experiments
Competition experiments with poly[d(I-C)] were performed using the complex of the 21 bp binding site fragment and Δ911zf at a complex concentration of 1 × 10⁻⁶ M. The protein was added to a mixture of specific and non-specific DNA. In these competition experiments the final concentration of poly[d(I-C)] was varied between 50 and 1000 µg/ml, or 500 µg/ml of the 21 bp binding site, and incubated at room temperature for 30 min. Analysis was performed in agarose gels as described above.
3. Results
(a) Characterization and purification of the DNA-binding domain of tramtrack
(i) Identification of the DNA-binding domain
In order to identify the DNA-binding domain of the Ttk protein several different regions of the protein, each containing the two zinc-finger motifs, were cloned, expressed in E. coli and their DNA-binding activity measured. The deletions of ttk shown in Figure 1 were expressed in the T7 polymerase promoter system of Studier et al. (1990) from the plasmid pET11a. The Ttk clones R113, Δ911, Δ912, Δ911zf and Δ912zf were all expressed to high levels in E. coli (Figs 1 and 2(a)), but the full length protein appears to be expressed poorly. The ability of the various ttk deletions to bind to DNA in a sequence-specific manner was tested using band-shift assays. The proteins were assayed for binding using a radioactively labelled HindIII-EcoRI restriction fragment containing the natural target site (the number 16 binding site) of the ftz upstream element (Harrison & Travers, 1990). This assay shows that R113, Δ911 and Δ911zf are able to bind to DNA, whereas Δ912 and Δ912zf do not (Fig. 2(b)). Consequently, the shortest peptide that can bind to the Ttk binding site is Δ911zf. We show below that the binding of Δ911zf is sequence-specific as determined by competition experiments, DNase I and hydroxyl radical footprinting studies and methylation protection studies of the protein-DNA complex. In Figure 1 it can be seen that the Δ911 peptide contains seven amino acids additional to the Δ912 peptide, located N-terminal to the first of the two zinc-finger motifs. Our results show that at least some of the amino acids within this region are required for sequence-specific DNA-binding and that the two zinc-finger motifs alone are insufficient to bind to DNA in a sequence-specific manner. Similarly it has also been shown that the yeast transcription factor ADR1 requires amino acids N-terminal to the first finger domain for DNA-binding activity (Thukral et al., 1991).
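The band-shift estimate of the dissociation constant described in Materials and Methods, section (j), reduces to finding the total protein concentration at which half of the DNA is complexed. The sketch below shows one way this could be read from a titration, by interpolating on a logarithmic concentration scale between the two points that bracket 50% complex. The function names are hypothetical, the interpolation scheme is an assumption of this example rather than the procedure used in the paper, and the titration values are synthetic.

```python
import math


def fraction_bound(protein_total: float, kd: float) -> float:
    """Fraction of DNA complexed for simple 1:1 binding when [D] << Kd."""
    return protein_total / (kd + protein_total)


def kd_from_titration(protein_conc: list[float], bound: list[float]) -> float:
    """Estimate Kd as the protein concentration giving 50% complex.

    `protein_conc` must be ascending; interpolation is linear in log10(concentration).
    """
    points = list(zip(protein_conc, bound))
    for (p_lo, f_lo), (p_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_kd = math.log10(p_lo) + t * (math.log10(p_hi) - math.log10(p_lo))
            return 10 ** log_kd
    raise ValueError("titration does not bracket 50% complex formation")


if __name__ == "__main__":
    # Synthetic titration generated from an assumed Kd; not data from the paper.
    assumed_kd = 4e-7
    concentrations = [1e-7, 2e-7, 4e-7, 8e-7]
    fractions = [fraction_bound(c, assumed_kd) for c in concentrations]
    print(kd_from_titration(concentrations, fractions))
```

The estimate is only as good as the condition [D] << Kd, since the derivation replaces the free protein concentration by the total.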
(ii) Ttk binds to DNA as a monomer
In order to establish whether Ttk protein binds as a monomer or a dimer, the three different length Ttk deletion mutants that are able to bind to DNA were mixed, incubated with binding-site DNA and the complexes formed analysed on a band-shift gel (Hope & Struhl, 1987). If the protein were to bind as a dimer, six bands representing protein-DNA complexes of different sizes would be expected, corresponding to DNA bound by three types of homodimer and three types of heterodimer. In the experiment shown in Figure 2(b), lanes 7 and 8 contain R113/Δ911/Δ911zf in the molar ratios of 1:1:1 and 3:1:1, respectively. No bands corresponding to bound heterodimers are observed, suggesting that the Ttk DNA-binding domain binds to DNA as a monomer.
(iii) Expression and purification of Δ911zf
Having identified the smallest region of Ttk required for DNA binding, the corresponding peptide Δ911zf was expressed in E. coli and purified on a large scale. The protein was only slightly soluble, most of the expressed protein being in the insoluble fraction of the E. coli. In order to obtain
the large amounts of protein required for structural work, the protein was purified from the inclusion bodies. The protein was solubilized from the inclusion bodies using 5 M-guanidine·HCl and then dialysed into 8 M-urea to enable contaminating nucleic acid to bind to DEAE-Sephacel. After this column fractionation step, the unfolded Δ911zf peptide was refolded by dialysis into a buffer containing zinc ions. The protein was then purified further using a batch method on CM-Sepharose. At this stage of the purification the protein was estimated to be 80% pure by densitometry of an SDS/polyacrylamide gel (Fig. 3(a)). As attempts at further purification resulted in either unacceptable loss of protein or loss of DNA-binding activity, the protein was not purified further.
Figure 1. Deletions of ttk expressed in E. coli. The amino acids shown in lower case letters are not the Ttk sequence but arise from the expression vector. The sequence of Δ911zf containing the 2 zinc-finger motifs is shown at the bottom.
(iv) DNA-binding activity of Δ911zf
Structural analyses by nuclear magnetic resonance or crystallographic methods require a protein to have one major conformation, and furthermore it is clearly important to demonstrate that the conformation is active. In order to be able to estimate the fraction of purified DNA-binding domain that is active, that is, able to bind to DNA sequence-specifically, the protein concentration has to be estimated accurately. This was done in two steps. First, the approximate protein concentration was estimated using turbidity measurements (Layne, 1957). Then, the DNA-binding activity of the protein was measured by titrating a known concentration of 21 bp DNA-binding site with different amounts of Δ911zf followed by analysis in band-shift gels (Fig. 3(b)) (Riggs et al., 1970). In these experiments, performed in the absence of non-specific DNA, aggregation occurs at high protein/DNA ratios. The short length of DNA was chosen to maximize the difference in migration between the naked DNA and the protein-DNA complex, and hence facilitate measurements of complex formation. This 21 bp fragment contains the full binding site, as identified from DNase I footprints, and binds to the protein with approximately the same affinity as a 49 bp fragment containing the Ttk binding site (see below). Estimates of protein concentration by both scattering measurements and absorbance readings at 258 and 275 nm (taking into account the aromatic residues: Yanari & Bovey, 1960) indicated that about 50% of the protein was able to bind specifically to its DNA-binding site. Because these two methods may give inaccurate estimates of protein concentration we used a third method that permits the bound and free protein to be visualized directly: a band-shift gel was split into two with both halves containing both complex and free protein, then one half was stained with ethidium
bromide to visualize the DNA and the other half was stained with PAGE blue 83 to visualize the protein (Fig. 3(c)). The free protein is retained at the top of the gel (Fig. 3(c), lane 4). In the lanes containing apparently 100% complex (Fig. 3(c), lanes 1 and 3), which according to the scattering measurements contain 2 mol of protein to 1 mol of binding site DNA, no free protein can be seen. There is some staining of a smear from the well to the complex band, presumably due to some dissociation of the complex upon electrophoresis. This analysis suggests that most of the protein is bound to the DNA. We conclude that the Δ911zf two-finger peptide purified by the protocol described above has more than 50% and probably close to full DNA-binding activity (see also Materials and Methods).

(b) Analysis of the DNA-binding site for Δ911zf
We have shown above that the 66 residue peptide Δ911zf, which consists of two zinc-finger motifs and seven amino acids N-terminal to the first zinc-finger motif, is sufficient for the sequence-specific binding of Ttk to its DNA-binding site. DNase I footprinting, hydroxyl radical footprinting, methylation protection and dissociation constant measurements for a variety of short oligonucleotides were then used to characterize the DNA-binding site and gain an insight into how Δ911zf interacts with DNA.
Figure 2. Expression levels and DNA-binding activity of the deletions of ttk. (a) The expression levels of the various deletions of ttk. Lane 1 contains molecular weight markers of size 12.3, 17.2, 30, 42.7, 66.25 and 78 × 10³. Lanes 2 and 3 contain crude cell lysates from the expression of 2 trials of the whole protein. Lanes 4, 5, 6, 7 and 8 contain crude lysates from the expression of R113, Δ911, Δ912, Δ911zf and Δ912zf, respectively. Samples were analysed in an SDS/10% to 25% polyacrylamide gel. (b) DNA-binding activity of the expressed deletions of ttk. In all the binding reactions the DNA concentration was 1 × 10⁻⁷ M. The DNA was incubated with portions of crude cell lysate containing various deletions of ttk: lane 1, R113; lane 2, Δ911; lane 3, Δ912; lane 4, Δ911zf; lane 5, Δ912zf; lane 6, no protein; lane 7, R113/Δ911/Δ911zf in the ratio 1:1:1; lane 8, R113/Δ911/Δ911zf in the ratio 3:1:1. Samples were analysed in a non-denaturing polyacrylamide gel.

(i) DNase I footprinting studies
DNase I digestion studies of Δ911zf bound to a 49 bp DNA fragment (from the ftz upstream element) were used to obtain information about the binding site of the protein (see below). In addition, the DNase I footprinting assay was used to estimate the approximate dissociation constant. When the DNA concentration is well below the Kd value, the concentration of protein required to obtain 50% complex approximates to the Kd (Johnson et al., 1979). Incubations with different concentrations of Δ911zf show that a protein concentration of 4.5 × 10⁻⁷ M is required to achieve apparently full protection of the DNA. No detectable protection was observed at a protein concentration of 2.25 × 10⁻⁷ M. Such a large effect on binding from a small increase in protein concentration would seem to indicate co-operative binding. However, given that the band-shift experiments in Figure 1(b) are consistent with the protein binding as a monomer, we believe that the all-or-none effect must arise from complications due to competition for binding by DNase I in the footprinting assay. Despite these complications, these studies nevertheless show that the apparent Kd value is less than 4.5 × 10⁻⁷ M. This value is close to the value of 4 × 10⁻⁷ M obtained from band-shift experiments (see below). The value is also similar to the dissociation constant of 1.4 × 10⁻⁷ M measured for a two-finger peptide from the human enhancer binding protein MBP-1 (Sakaguchi et al., 1991). DNase I footprinting studies show that the
isolated DNA-binding domain Δ911zf protects 21 bp of DNA from cleavage by DNase I (Fig. 4(a)), as does the full-length protein (Harrison & Travers, 1990). On the top strand there is a prominent cleavage slightly displaced from the centre of the protected region, between base-pairs 26 and 27. This cleavage is also observed in footprints of the full-length Ttk protein (Harrison & Travers, 1990; Brown et al., 1991), indicating that the binding of Δ911zf and that of Ttk are the same in this respect. In order to see the binding site more clearly, the DNase I cutting pattern of the naked DNA has been subtracted from that of the protein-DNA complex. For each strand the probability of cleavage of each bond in the naked DNA has been subtracted from the probability of cleavage of the corresponding bond in the Ttk-DNA complex, and the difference probabilities plotted against the sequence of the DNA (Fig. 4(b)). The pattern of protection on both the top and bottom strands is bipartite. The major area of protection is located on the top strand, from nucleotides 15 to 24, and there is weaker protection on the top strand nucleotides 27 to 31 and the bottom strand nucleotides 16 to 21 and 24 to 30. The bipartite nature arises because on both the top and bottom strands there is a bond that is cleaved relative to the protection around it: the bond 3' of nucleotide 26 on the top strand and the bond 3' of nucleotide 22 on the bottom strand. As DNase I is a large probe, we have made use of the known contacts made by DNase I with its cleavage site, as determined from the co-crystal structure of DNase I with DNA (Suck et al., 1988), to deduce the region of the DNA within the footprint contacted by Ttk. We assume that the contacts required for DNase I cleavage are transiently available to the enzyme. In Figure 7 the sites of cleavage and the phosphate contacts required to
Figure 3. Purification and DNA-binding activity of Δ911zf. (a) The various purification steps of Δ911zf. Lane 1 contains molecular weight markers of 12.3, 17.2, 30, 42.7, 66.25 and 78 × 10³. Lane 2 contains the protein after dialysis into urea. Lane 3 contains the flow-through from the DEAE-Sephacel column. Lane 4 contains the protein after dialysis into buffer A. Lane 5 contains Δ911zf after purification on CM-Sepharose. Analysis was carried out in an SDS/polyacrylamide gel. (b) Estimation of the active Δ911zf concentration. In each 150 µl binding reaction the DNA concentration was 7.3 × 10⁻⁶ M and in addition lanes 2 to 11 contained from 2 to 20 µl of Ttk, increased in 2 µl steps. The negative of this gel was densitometer traced, the amount of complex formed in each lane calculated, then the amount of protein required for 25% complex formation calculated, and thus the stock concentration of protein was estimated to be 1.4 × 10⁻⁴ M. (c) Estimation of the fraction of active Δ911zf. Lanes 1 and 3 show 20 µl of a 1:1 complex at a concentration of 7.3 × 10⁻⁶ M and lanes 2 and 4 show the same amount of protein as used in the binding experiment shown in lanes 1 and 3. One half of the gel (lanes 1 and 2) was stained with ethidium bromide and the other half (lanes 3 and 4) was stained with PAGE blue 83.
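The estimate of the stock concentration in Figure 3(b) can be illustrated with a short calculation. This is a minimal sketch, assuming a 1:1 complex and that essentially all of the added active protein is bound (the DNA concentration is well above the dissociation constant). The function name is hypothetical, and the 2 µl volume is an illustrative assumption rather than a value stated in the caption.

```python
def stock_concentration(fraction_complexed, dna_conc_molar,
                        reaction_volume_l, protein_volume_l):
    """Estimate the active protein stock concentration (M) from a titration.

    Assumes stoichiometric 1:1 binding, so that the moles of complex formed
    equal the moles of active protein added.
    """
    moles_complex = fraction_complexed * dna_conc_molar * reaction_volume_l
    return moles_complex / protein_volume_l


# Caption values: 150 µl reactions, 7.3e-6 M DNA, 25% complex.
# The 2 µl of protein stock is an assumed, illustrative volume.
estimate = stock_concentration(0.25, 7.3e-6, 150e-6, 2e-6)
print(f"estimated stock concentration: {estimate:.2e} M")
```

With these inputs the estimate falls close to the 1.4 × 10⁻⁴ M quoted in the caption.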
Figure 4. DNase I footprint of Δ911zf bound to a 49 bp binding site. (a) Autoradiograph of the DNase I protection patterns at different concentrations of protein. Strand A, the top strand; Strand B, the bottom strand. The DNA concentration in each reaction was 1 × 10⁻⁸ M, and for each protein concentration there are 2 DNase I digestion time points (1 and 2 min) shown. The protein concentrations in the binding reactions were as follows: lanes 1 and 2, 0; lanes 3 and 4, 9 × 10⁻⁷ M; lanes 5 and 6, 4.5 × 10⁻⁷ M; lanes 7 and 8, 2.25 × 10⁻⁷ M; lanes 9 and 10, 1.12 × 10⁻⁷ M; lanes 11 and 12, 5.6 × 10⁻⁸ M; lanes 13 and 14, 2.8 × 10⁻⁸ M; lanes 15 and 16, 1.4 × 10⁻⁸ M; lanes 17 and 18, 7 × 10⁻⁹ M. The lanes marked G are marker tracks showing the location of guanine residues in the sequence and are produced using G-track reactions from Maxam & Gilbert sequencing. The analysis was carried out in 20% denaturing polyacrylamide gels and the gels were exposed wet at -70°C. (b) DNase I difference probability plot for the Δ911zf-DNA complex, drawn against the sequence of the two strands (top strand 5'-CATCAACATCTAATAAGGATAACGTCCATTAACAATGA-3'; bottom strand 3'-GTAGTTGTAGATTATTCCTATTGCAGGTAATTGTTACT-5'). The plot was calculated using lanes 2 and 4 for both the top and bottom strands. In each case the bond cut is to the 3' of the base numbered. Negative values show bonds that are protected and values close to zero indicate cleavage rates close to those of the naked DNA.
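The subtraction behind the difference probability plot can be sketched as follows. This is a minimal illustration, assuming that per-bond cleavage probabilities have already been derived from densitometry of each lane; how those probabilities were obtained is not specified here. The function names, the threshold and all numerical values are hypothetical.

```python
def difference_probabilities(naked, complexed):
    """Return complex-minus-naked cleavage probability for every bond.

    Both arguments map a bond number (the nucleotide 5' of the cut) to a
    cleavage probability. Negative results indicate protection.
    """
    if naked.keys() != complexed.keys():
        raise ValueError("both lanes must cover the same bonds")
    return {bond: complexed[bond] - naked[bond] for bond in sorted(naked)}


def protected_bonds(differences, threshold=-0.05):
    """List the bonds whose difference is at or below an assumed threshold."""
    return [bond for bond, diff in differences.items() if diff <= threshold]


# Hypothetical probabilities for five bonds (not measured values).
naked_dna = {24: 0.10, 25: 0.10, 26: 0.10, 27: 0.10, 28: 0.10}
complex_dna = {24: 0.02, 25: 0.03, 26: 0.11, 27: 0.04, 28: 0.09}

differences = difference_probabilities(naked_dna, complex_dna)
print(protected_bonds(differences))
```

In this made-up example one bond inside the protected stretch remains cleaved, which is the kind of pattern that gives a bipartite footprint.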
make those cleavages have been plotted on a cylindrical projection of a DNA double helix. Using all of this information we can produce a more accurate picture of the actual area of DNA bound by the protein. When the phosphate contacts required by DNase I to cleave at the borders of the footprint are taken into account, it can be deduced that the region apparently occupied by Δ911zf is nucleotides 18 to 24 and 29 to 31 on the top strand and nucleotides 16 to 19 and 25 to 29 on the bottom strand (Fig. 7).

(ii) Hydroxyl radical footprinting studies

In contrast to DNase I, the hydroxyl radical is a very small probe and cleaves DNA at every position of the DNA backbone, and hence permits the protection of the DNA by the protein to be seen in more detail. Hydroxyl radical footprinting experiments were performed using different protein concentrations, but at higher protein concentrations there was less cutting of the DNA due to quenching of the hydroxyl radical by the protein. The data from the footprinting studies using the hydroxyl radical (Fig. 5(a)) were treated in the same way as those from DNase I. The difference probability plot in Figure 5(b) shows that the region of DNA protected by Δ911zf from hydroxyl radical cleavage falls within the DNase I footprint and has similar characteristics. The region of protection is from base-pairs 15 to 30. Within this region there are two strongly protected areas: the bonds between nucleotides 18 to 24 on the top strand, which show a split maximum of protection, and the bonds between nucleotides 26 to 29 on the bottom strand. These areas of protection are plotted as darkly shaded areas in Figure 7. The area of protection on the bottom strand lies directly across the major groove from the area of protection on the top strand. The weaker areas of protection, the bonds between nucleotides 28 to 30 on the top strand and the bonds between nucleotides 14 to 20 on the bottom strand, lie across the minor groove on either side of the protected major groove. In addition, the cleavage rates outside of the region of protein binding are affected: the bonds between nucleotides 32-33, 33-34 and 39-40 on the top strand are cleaved at an enhanced rate, perhaps due to a slight change in the DNA structure upon protein binding.

Figure 5. Hydroxyl radical footprinting experiments of Δ911zf bound to the 49 bp binding site. (a) Hydroxyl radical protection pattern at different Δ911zf concentrations. Strand A, the top strand; Strand B, the bottom strand. In these binding reactions the DNA was at a concentration of 1 × 10⁻⁷ M and the protein was at the following concentrations: lane 1, 0; lane 2, ~7 × 10⁻⁷ M; lane 3, 1 × 10⁻⁶ M; lane 4, 2.5 × 10⁻⁶ M; lane 5, 5 × 10⁻⁶ M. The lanes marked G are marker tracks showing the location of guanine residues in the sequence and are produced using G-track reactions from Maxam & Gilbert sequencing. The analysis was carried out in a 20% denaturing polyacrylamide gel and the gel was exposed wet at -70°C. (b) Hydroxyl radical difference probability plot for the Δ911zf-DNA complex. The plot was calculated from lanes 1 and 2 for both the top and bottom strands. In each case the bond cut is to the 3' of the base numbered. Negative values show bonds that are protected. Values around -0.2 indicate a reactivity that is close to that of the naked DNA, because the Δ911zf-DNA complex is cut less than the naked DNA.

(iii) Methylation protection studies
In common with the hydroxyl radical, dimethylsulphate is a small probe in comparison to DNase I. Dimethylsulphate methylates the N-7 atoms of guanine residues, which lie in the major groove of DNA, and at a much slower rate the N-3 atoms of adenine residues, which lie in the minor groove (Ogata & Gilbert, 1978, 1979). Δ911zf protects the equivalent of guanine residues 22 and 23 on the top strand of the 49 bp binding site from methylation by dimethylsulphate (Fig. 6). The extent of methylation of these guanine residues in the protein-DNA complex is half that of the naked DNA. This is consistent with the data from Brown et al. (1991), who find that methylation of these two conserved guanine residues in other Tramtrack binding sites in the ftz zebra element (see Table 1) prevents binding of the protein to DNA. On the bottom strand there is enhanced methylation of an adenine at nucleotide 25. This is most likely caused by a change in the structure of the DNA upon binding of the protein. There is no protection against methylation of the guanine (nucleotide 28) located on the bottom strand, within the DNase I and hydroxyl radical footprints.

(iv) Interpretation of footprinting results
Previous data for other zinc-finger proteins (Sakonju & Brown, 1982; Fairall et al., 1986; Pavletich & Pabo, 1991) indicate that zinc-fingers bind in the major groove of DNA. The protection from methylation of the two guanine residues corresponding to base-pairs 22 and 23 on the 49 bp fragment is consistent with Δ911zf binding in the major groove. Hence the observed protection from DNase I can be explained: the binding of Δ911zf in the major groove in the region of nucleotides 18 to
Figure 6. Methylation protection by Δ911zf bound to a 100 bp restriction fragment. Strand A, the top strand; Strand B, the bottom strand. Lanes 1 and 2 contain DNA at a concentration of 1 × 10⁻⁸ M reacted with dimethylsulphate for 3 and 6 min, respectively. Lanes 3 and 4 contain Δ911zf at a concentration of 1 × 10⁻⁶ M bound to DNA at a concentration of 1 × 10⁻⁸ M, also reacted with dimethylsulphate for 3 and 6 min, respectively. The analysis was carried out in an 8% denaturing polyacrylamide gel, which was dried, and autoradiography carried out at -70°C.
Table 1
Tramtrack binding sites

Site                                      Sequence
ftz TSE (2-finger DNase I protection)      CTAATAA GGA TAACGTCCAT
ftz z1                                     GTTGCCA GGA CCTCGGATA
ftz z2                                      GCGCAG GGA TATTTATGCGC
ftz z3                                    GCCTGCAA GGA CATTTCGCC
BSC                                       TTGTGAGC GGA TAACAATTCCAC
By homology: RUS4A                          ACGCAA GGA TCTTTGCGGG
Consensus                                    ygcaa GGA taty

Sources: Harrison & Travers (1990); Brown et al. (1991); RUS4A from M. Fortini & G. M. Rubin (personal communication).

Binding sites for the Drosophila melanogaster Tramtrack (Ftz F2) protein aligned on the prominent cleavage within the DNase I footprint of protein-DNA and Δ911zf peptide-DNA complexes (see the text). One additional site (RUS4A) is aligned by homology. This site is contained within an oligomer of the conserved sequence 4A in the promoter region of the Drosophila rhodopsin gene. The conserved GGA triplet is set off by spaces.
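The alignment in Table 1 can be checked with a few lines of code. This is a rough sketch, assuming the site sequences read as listed in the table and anchoring each site on its first GGA triplet; the authors aligned the sites on the prominent DNase I cleavage, so the anchoring rule, the window size and the function names used here are illustrative assumptions.

```python
# Site sequences as listed in Table 1 (top strand, 5' to 3').
SITES = {
    "ftz TSE": "CTAATAAGGATAACGTCCAT",
    "ftz z1": "GTTGCCAGGACCTCGGATA",
    "ftz z2": "GCGCAGGGATATTTATGCGC",
    "ftz z3": "GCCTGCAAGGACATTTCGCC",
    "BSC": "TTGTGAGCGGATAACAATTCCAC",
    "RUS4A": "ACGCAAGGATCTTTGCGGG",
}


def align_on_triplet(sites, triplet="GGA", left=5, right=4):
    """Cut an equal-length window around the first occurrence of the triplet.

    Using the first occurrence is an assumption; it matters for sites that
    contain the triplet more than once.
    """
    windows = {}
    for name, seq in sites.items():
        start = seq.find(triplet)
        if start < left or start + len(triplet) + right > len(seq):
            raise ValueError(f"window does not fit in site {name}")
        windows[name] = seq[start - left:start + len(triplet) + right]
    return windows


def conserved_columns(windows):
    """Return the column indices at which every window has the same base."""
    rows = list(windows.values())
    return [i for i in range(len(rows[0])) if len({row[i] for row in rows}) == 1]


windows = align_on_triplet(SITES)
for name, window in windows.items():
    print(f"{name:8s} {window}")
print("invariant columns:", conserved_columns(windows))
```

Under these assumptions only the three columns of the anchoring triplet are invariant across the six sites.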
Figure 7. A summary of the sites of protection and exposure in the Δ911zf-DNA complex plotted on a cylindrical projection of a schematic DNA double helix. The base-pairs are drawn across the minor groove. (↓) DNase I cutting sites. (○) Contact points to the phosphate groups that are required for binding of DNase I in order to cut the DNA (Suck et al., 1988) and hence are regions of exposure to DNase I. The dark shaded areas indicate the regions showing the peaks of protection from the hydroxyl radical and the lighter shaded areas indicate regions of weaker protection from the hydroxyl radical. The guanine residues shown in white lettering are protected from dimethylsulphate modification by Δ911zf.
24 of the top strand and 25 to 29 of the bottom strand hinders the binding of DNase I to the minor groove of the DNA on either side of that region, and so gives protection over a large stretch of DNA. Thus, the DNase I footprint is consistent with the binding of the DNA-binding domain of Ttk in the major groove, and the strongest protection indicates that it may be interacting primarily with seven nucleotides
of the top strand. The data from the hydroxyl radical footprinting studies are consistent with the DNase I footprinting studies and can be explained in a similar way, because it appears that the hydroxyl radical, like DNase I, attacks the minor groove (Burkhoff & Tullius, 1987). There is strong protection from the hydroxyl radical of the phosphate backbone on either side of the major groove
and weaker protection across the minor groove on either side of the strong protection in the major groove. In conclusion, all the footprinting data are consistent with the two zinc-fingers of Ttk binding in the major groove of DNA to a region of seven nucleotides of the top strand, but also binding to the bottom strand in the region located directly across the major groove.

(v) Identification of the minimal DNA-binding site

The footprinting data presented above suggest that the binding site of the DNA-binding domain of Ttk, Δ911zf, is of the order of 12 bp (nucleotides 18 to 29 inclusive: Fig. 7). To establish the minimal binding site of Δ911zf that retains full binding affinity, relative dissociation constants for a range of binding site lengths were measured. The different oligonucleotides were titrated with increasing amounts of Δ911zf, and the dissociation constant was taken to be equivalent to the protein concentration that gives 50% complex (see Materials and Methods), as visualized in band-shift gels (Fig. 8(a)). The relative Kd values obtained are shown in Figure 8(b) together with the oligonucleotides used in the measurements of dissociation constants. It should be noted that from these experiments one can only determine relative dissociation constants, because these studies were performed in the presence of both non-specific competitor DNA (Johnson et al., 1979) and the detergent NP40. We found that if either of these components was removed from the binding assay no binding was observed, presumably due to loss of protein through adhesion to surfaces of the Eppendorf tubes and tips
Figure 8(b) [panel: oligonucleotides 1 to 11, derived from the 49 bp binding site and shortened around the core sequence TAAGGATAACG, each listed with its relative Kd (M); the legible values range from 4 × 10⁻⁷ M to 6 × 10⁻⁶ M]
Figure 8. Binding of Δ911zf to binding sites of different length. (a) Measurements of dissociation constants for oligonucleotides of different length. The oligonucleotides are numbered as for (b) and the protein concentration for each binding reaction is given at the top of the gels. (b) Dissociation constants for oligonucleotides of different length containing the Δ911zf binding site, measured from the band-shift experiments shown in (a). The hatched boxes shown on oligonucleotide 1 are the extent of the DNase I footprint. The shaded boxes shown on oligonucleotide 2 are the regions protected from DNase I after subtraction of the phosphate contacts required for binding of DNase I. The broken lines delimit the region of the binding site required before there is a loss of binding affinity. (c) Competition experiments. Lane 1 shows the naked DNA. In each binding reaction the concentration of the 21 bp binding site was 1 × 10⁻⁶ M and the protein was at ~1 × 10⁻⁶ M with the addition of: lane 2, no competitor; lane 3, 50 µg poly[d(I-C)]/ml; lane 4, 100 µg poly[d(I-C)]/ml; lane 5, 500 µg poly[d(I-C)]/ml; lane 6, 1 mg poly[d(I-C)]/ml; lane 7, 3.6 × 10⁻⁵ M unlabelled 21 bp binding site.
used for pipetting, or aggregation of the protein. It is also likely that the exact composition of the complex in solution will not be accurately represented in the gel, for three reasons: samples are diluted upon loading into the well of the gel; separation of the components in the gel is likely to disturb the equilibrium; and the process of electrophoresis may be slightly destructive. It can be seen that Δ911zf binds to the 49 bp and 21 bp oligonucleotides containing the Ttk binding site with the same affinity, with a dissociation constant of ~4 × 10⁻⁷ M (Fig. 8(b)). This value is similar to the one obtained from the DNase I footprinting studies using the 49 bp DNA fragment (see above). The shortest oligonucleotide that Δ911zf can bind without a significant loss of binding affinity is 11 bp in length with an overhanging adenine at each 5' end. Further reductions in length result in an increase in dissociation constant: 9 × 10⁻⁷ M for the deletion of the adenine at the 5' end of the top strand and 6 × 10⁻⁶ M for the deletion of the adenine at the 5' end of the bottom strand. This result is in close agreement with the binding site deduced from the DNase I and hydroxyl radical footprinting experiments.

(vi) Competition with poly[d(I-C)]

Since the dissociation constant of 4 × 10⁻⁷ M appeared rather high in comparison with, for example, the value of 6 × 10⁻⁹ M for the Zif268-DNA complex (Pavletich & Pabo, 1991), we tested the binding specificity of Δ911zf in a competition band-shift assay between the radioactively labelled 21 bp DNA binding site and different amounts of poly[d(I-C)]. In this experiment both the protein Δ911zf and the 21 bp binding site were at a concentration of 1 × 10⁻⁶ M and the concentration of poly[d(I-C)] was varied from ~5 × 10⁻⁶ M to 1 × 10⁻⁴ M (w/w as compared with the defined 13 bp binding site), a 5 to 100-fold excess over the specific DNA binding site. In a control experiment, unlabelled 21 bp fragment at a concentration of 4 × 10⁻⁵ M was also used as a competitor. As can be seen in Figure 8(c), poly[d(I-C)], at a concentration 100-fold higher than the dissociation constant, has no effect on the specific binding of Δ911zf to the 21 bp DNA binding site. In contrast, the unlabelled 21 bp binding site competes for binding and no radioactive protein-DNA complex can be observed. So although the dissociation constant appears rather high, the DNA-binding domain of Ttk binds highly selectively to its binding site in the ftz upstream element.
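The 50% criterion used for the dissociation constant and the outcome of the control with unlabelled specific competitor can both be illustrated with the exact solution for simple 1:1 binding. This is a minimal sketch, assuming a single class of identical, non-interacting sites and equal affinity for labelled and unlabelled DNA. It does not model poly[d(I-C)], detergent or gel effects, and the function names are hypothetical.

```python
import math


def complex_concentration(protein_total, dna_total, kd):
    """Equilibrium concentration (M) of a 1:1 protein-DNA complex.

    Solves Kd = [P][D]/[PD] with mass conservation for both components.
    """
    b = protein_total + dna_total + kd
    return (b - math.sqrt(b * b - 4.0 * protein_total * dna_total)) / 2.0


def labelled_fraction_bound(protein_total, labelled, unlabelled, kd):
    """Fraction of the labelled site found in complex.

    Labelled and unlabelled sites are assumed to bind identically, so the
    labelled site is bound in proportion to its share of the total DNA.
    """
    dna_total = labelled + unlabelled
    return complex_concentration(protein_total, dna_total, kd) / dna_total


KD = 4e-7  # apparent dissociation constant from the band-shift experiments (M)

# DNA well below Kd and protein equal to Kd: about half of the DNA is bound.
print(labelled_fraction_bound(KD, 1e-8, 0.0, KD))

# Competition conditions: protein and labelled site both at 1e-6 M,
# without and with 4e-5 M unlabelled specific site.
print(labelled_fraction_bound(1e-6, 1e-6, 0.0, KD))
print(labelled_fraction_bound(1e-6, 1e-6, 4e-5, KD))
```

Under these assumptions roughly half of the labelled site is bound without competitor and only a few per cent with the unlabelled site present, which is in line with the loss of the radioactive complex in the control lane.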
4. Discussion

We have shown that a monomeric 66 amino acid peptide from the transcription regulatory protein Tramtrack (Δ911zf) is sufficient to direct sequence-specific DNA binding. This DNA-binding domain consists of two zinc-finger motifs and a sequence of seven amino acids N-terminal to the first zinc-finger motif. Thus, the DNA-binding domain extends beyond the zinc-finger sequence motifs. Similarly, three zinc-fingers of the yeast transcription factor SWI5 require amino acids N-terminal to the first zinc-finger for stable folding. In the two-dimensional nuclear magnetic resonance studies of the SWI5 zinc-fingers it can be seen that the first seven residues of the 11 amino acids N-terminal to the first zinc-finger domain form a third strand to the antiparallel β-sheet of that zinc-finger (D. Neuhaus & Y. Nakaseko, personal communication). These observations suggest that the amino acids N-terminal to the first zinc-finger of Ttk, which are essential for DNA binding, may play an essential role in the structure of the first zinc-finger domain. The DNA-binding domain of the yeast transcription factor ADR1 also contains residues N-terminal to the first of the two zinc-finger domains (Thukral et al., 1991). In contrast, a 57 amino acid peptide containing only two zinc-finger motifs, from the human enhancer binding protein MBP-1, is sufficient for sequence-specific recognition (Sakaguchi et al., 1991), as is a 90 amino acid peptide, containing only three zinc-finger motifs, from Zif268 (Pavletich & Pabo, 1991). These findings highlight differences in zinc-finger structure, which may in turn determine the mode of binding to DNA.

We have expressed in E. coli and purified the fully active 66 amino acid DNA-binding domain of Ttk and have analysed the DNA-binding site for this peptide in detail. The DNase I footprint of the two-finger peptide is similar in two respects to that of the full-length protein. First, the size of the DNase I footprint, ~21 bp, is essentially the same as the footprint observed both with Drosophila embryonic extracts (Harrison & Travers, 1990) and with the purified protein (Brown et al., 1991).
Second, both the 66 residue peptide and the full-length Ttk fail to protect the TA step at position 25-26 on the top strand from cleavage. We also observe methylation protection of the two conserved guanine residues in the Ttk binding site (nucleotides 22 and 23 on the top strand). This is consistent with the data from Brown et al. (1991), who find that methylation of these two conserved guanine residues in other Ttk binding sites in the ftz zebra element (see Table 1) prevents binding of the protein to DNA. It would thus appear that the 66 amino acid Δ911zf peptide interacts with the target DNA sequence in a very similar manner to full-length Ttk. The size of the binding site of the protein on the DNA was further investigated by measuring relative dissociation constants between Δ911zf and a range of oligonucleotides of different length. The shortest double-stranded oligonucleotide to which Δ911zf binds without loss of binding affinity is 11 bp, with an overhanging adenine at each 5' end. The length of this minimal binding site for the two zinc-fingers of Ttk contrasts with the findings for the murine zinc-finger protein Zif268. The crystal
structure of the complex between the DNA-binding domain of Zif268 and DNA contains a 10 bp oligonucleotide with an extra nucleotide at each 5' end. Zif268 binds to this oligonucleotide with a reported dissociation constant of 6 × 10⁻⁹ M (Pavletich & Pabo, 1991). In this case, three zinc-fingers are binding to a 10 bp fragment with an extra nucleotide at each 5' end, whereas Δ911zf requires a longer binding site of 11 bp with an extra nucleotide at each 5' end for two zinc-fingers. This difference in the size of binding site suggests that the mode of binding of the two zinc-fingers of Δ911zf differs from that of Zif268. The results from the DNase I footprinting, hydroxyl radical footprinting and methylation protection studies are consistent with the binding of two fingers of Ttk in the major groove of the DNA with one finger following the other. The linker does not cross the minor groove as proposed for every alternate linker in TFIIIA (Fairall et al., 1986), although it is homologous in sequence to the type that is proposed to cross the minor groove (Kochoyan et al., 1991). Although both DNA strands are protected from DNase I cleavage over a region of about 20 bp, the strongest protection is on one of the two DNA strands. When one takes into consideration the phosphate contacts required for DNase I to cleave the DNA at the borders of the protected region, a binding site for Δ911zf of seven nucleotides is delimited, from nucleotide 18 to nucleotide 24 on the top strand. This region shows a split peak of protection from hydroxyl radical attack, which could correspond to the binding of the two finger domains to the top strand. The proximity of the protein to the DNA in this region is further demonstrated by the protection of guanine residues 22 and 23 from methylation by dimethylsulphate.
The size of the protection is consistent with the mode of binding observed for Zif268, in which all contacts are essentially with one of the two DNA strands, with each finger essentially covering three base steps, and in which there is an additional contact by a C-terminal amino acid to a phosphate 5' to the binding site in a neighbouring DNA molecule (Pavletich & Pabo, 1991). However, in addition to the protection on the top strand we observe on the bottom strand a third peak of protection from hydroxyl radical and DNase I, the bonds between nucleotides 26 to 29. This protection on the bottom strand is located directly across the major groove from the strong area of protection on the top strand (the bonds between nucleotides 21 to 24). Thus, the interpretation derived from footprinting studies is in agreement with the length and sequence of the shortest oligonucleotide required to retain full DNA-binding activity. These observations suggest that either one of the fingers of the Ttk DNA-binding domain interacts with both DNA strands in the same region of the major groove or that the linker may contact the other strand opposite the principal finger contacts. It seems more likely that one of the fingers is contacting both strands. The lack of protection of
guanine 28 on the bottom strand from dimethylsulphate suggests that only the phosphate backbone may be involved in contacts with the protein in this region of the binding site. The N-terminal finger of Zif268 binds to the 3' end of the DNA-binding site and the C-terminal finger binds to the 5' end of the binding site. If it is assumed that the fingers of Δ911zf bind in the same orientation as that observed for Zif268, then finger one could be binding to the region around nucleotides 21 to 24 on the top strand, with interactions to the region around nucleotides 26 to 28 on the bottom strand, and finger two could be binding to the region around nucleotides 18 to 20 on the top strand. This contrasts with the Zif268 structure, in which each of the three finger domains interacts with the DNA in a very similar manner. The different mode of binding for the two fingers of Ttk could arise from the role of the residues N-terminal to finger one, or because the sequence linking the two fingers is very different from that in Zif268. The basis of the sequence specificity of Ttk protein is not easily understood, since it binds to several related but different sequences in the promoter region of fushi tarazu (see Table 1). The major DNase I cleavage within the protected region (at the TA step) is a feature common to all characterized binding sites. This cleavage site thus provides an internal reference point for the alignment of Ttk binding sites. When the sites are aligned on this basis, the only absolutely conserved sequence that is apparent is the trinucleotide GGA corresponding to base-pairs 22, 23 and 24 (of the 49 bp fragment) (Table 1). These bases lie in the region protected from DNase I cleavage by Δ911zf.
We have found that Δ911zf protects the conserved guanine residues in the ftz upstream element from methylation by dimethylsulphate and these guanines, when methylated, have also been shown to interfere with the binding of Ttk to sites in the zebra element of ftz (Brown et al., 1991). The "consensus" binding sequence derived from this alignment differs significantly from that proposed by Brown et al. (1991) on the basis of sequence homology. From crystallographic studies and in vitro mutagenesis it has been deduced that each zinc-finger of the three-fingered Krox-20/Zif268 protein spans 3 bp and that the binding sites for successive fingers are contiguous (Nardelli et al., 1991; Pavletich & Pabo, 1991). Although no pattern of six successive conserved base-pairs is apparent in the Ttk consensus binding site, the conserved GGA is surrounded by semiconserved bases. In the structure of the Zif268-DNA complex essentially all the contacts are to one of the two DNA strands; each finger makes two contacts to the bases and there are three positions in the protein sequence involved in making these contacts. The amino acids involved are spaced three amino acids apart on one face of the DNA recognition helix. The DNA binding site for each finger can be said to have three potential bases that may be involved in the specific contacts to the protein (see Neuhaus & Rhodes,
1991). Fingers 1 and 3 of Zif268 have the same pattern of specific contacts, with arginine residues at the first and third positions making contact to guanine residues one nucleotide apart. Finger 2 has an arginine at the first position, which also makes contact to a guanine, and a histidine at the second position, which makes contact to an adjacent (5') guanine. It therefore appears that there may be a general mechanism for zinc-fingers to bind to DNA, with each position in the binding site always making contact to the same position in the protein sequence and structure. If we assume that the zinc-fingers of Ttk use the same general mechanism for binding, we can obtain an alignment that fits with all of the binding data described in this paper and which results in a number of the contacts seen for Zif268 being conserved. First, in order to define the binding site accurately we make use of the phosphate contacts required for DNase I to cleave in the observed positions. Then, because most of the residues involved in making phosphate contacts are conserved between Zif268 and Ttk, we have used them to position the Ttk fingers within the DNase I footprint. In this way the Ttk zinc-fingers can be aligned so that the histidine (position 59 in the Ttk sequence in Fig. 1) of finger 2 is contacting the phosphate between nucleotides 18 and 19; finger 1 could then be recognising G23, A24, T25 and finger 2 A20, A21, G22. The arginine at position 28 (finger 1) and the arginine at position 52 (finger 2) in the Ttk sequence would then be in a suitable position to contact G23 and G22, respectively, as in the Zif268 complex. The asparagine residues at positions 25 and 55 would then be in a position to contact A24 and A21, respectively. Hydrogen bonds between asparagine and adenine residues are present in the two homeodomain complexes (Kissinger et al., 1990; Wolberger et al., 1991).
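The nucleotide numbering used in this alignment can be cross-checked against the sequence of the 49 bp fragment. This is a minimal sketch, assuming that the top strand reads as reconstructed here from the lettering of the figures; the sequence constant and the helper names are assumptions made for illustration, not data supplied in the text.

```python
# Assumed reconstruction of the 49-nucleotide top strand (5' to 3'),
# numbered from 1 at the 5' end.
TOP_STRAND = "TCCGACATCAACATCTAATAAGGATAACGTCCATTAACAATGATCGGTT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def top(position):
    """Base on the top strand at a 1-based position."""
    return TOP_STRAND[position - 1]


def bottom(position):
    """Base on the bottom strand paired with the given top-strand position."""
    return COMPLEMENT[top(position)]


def top_triplet(first, last):
    """Top-strand bases from position first to position last, inclusive."""
    return TOP_STRAND[first - 1:last]


# Triplets proposed for the two fingers in the alignment above.
finger_1_site = top_triplet(23, 25)  # G23, A24, T25
finger_2_site = top_triplet(20, 22)  # A20, A21, G22
print(finger_1_site, finger_2_site, bottom(28))
```

With this reconstruction the protected guanines fall at positions 22 and 23 of the top strand, and the unprotected guanine at position 28 lies on the bottom strand, as described in the text.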
The adenine at nucleotide 21 is not completely conserved but there may be some redundancy between zinc fingers and their binding sites. This alignment also places finger 1 of Ttk in the correct position for the potential third β-strand of this finger to contact the other DNA strand at the location indicated by the footprinting data (D. Neuhaus, personal communication).
In conclusion, we have defined a DNA-binding domain from the Drosophila transcription factor Tramtrack, A911zf, and have identified a minimal DNA binding site for this peptide. The results of the binding and footprinting studies presented here suggest that the mode of sequence-specific recognition for Ttk departs in detail from the only known structure of a zinc-finger-DNA complex, that of Zif268 (Pavletich & Pabo, 1991), resulting in a requirement for a longer binding site. This difference could reside in the essential role of the N-terminal amino acids in sequence-specific binding. The characterization of a small and defined protein-DNA complex presented here will permit the determination of its three-dimensional structure either by X-ray diffraction of single crystals or by nuclear magnetic resonance spectroscopic techniques.
We thank Terry Smith and Jan Fogg for oligonucleotide synthesis, Lynda Chapman for her help with oligonucleotide purification and John Schwabe, Sarbjit Ner, Timm Jessen and Bob Dutnall for critical comments on the manuscript. This work was supported in part by a grant from the Human Frontier Science Programme.
References
Berg, J. M. (1988). Proposed structure for the zinc-binding domains from transcription factor IIIA and related proteins. Proc. Nat. Acad. Sci., U.S.A. 85, 99-102.
Brown, J. L., Sonoda, S., Ueda, H., Scott, M. P. & Wu, C. (1991). Repression of the Drosophila fushi tarazu (ftz) segmentation gene. EMBO J. 10, 665-674.
Burkhoff, A. M. & Tullius, T. D. (1987). The unusual conformation adopted by the adenine tracts in kinetoplast DNA. Cell, 48, 935-943.
Churchill, M. E. A., Tullius, T. D. & Klug, A. (1990). Mode of interaction of the zinc-finger protein TFIIIA with a 5 S RNA gene of Xenopus. Proc. Nat. Acad. Sci., U.S.A. 87, 5528-5532.
Fairall, L., Rhodes, D. & Klug, A. (1986). Mapping of the sites of protection on a 5 S RNA gene by the Xenopus transcription factor IIIA. J. Mol. Biol. 192, 577-591.
Gibson, T. J., Postma, J. P. M., Brown, R. S. & Argos, P. (1988). A model for the tertiary structure of the 28 residue DNA-binding motif (‘zinc-finger’) common to many eukaryotic transcriptional regulatory proteins. Protein Eng. 2, 209-218.
Harrison, S. D. & Travers, A. A. (1990). The tramtrack gene encodes a Drosophila finger protein that interacts with the ftz transcriptional regulatory region and shows a novel embryonic expression pattern. EMBO J. 9, 207-216.
Hiromi, Y., Kuroiwa, A. & Gehring, W. J. (1985). Control elements of the Drosophila segmentation gene fushi tarazu. Cell, 43, 603-613.
Hope, I. A. & Struhl, K. (1987). GCN4, a eukaryotic transcriptional activator protein, binds as a dimer to target DNA. EMBO J. 6, 2781-2784.
Johnson, A. D., Meyer, B. J. & Ptashne, M. (1979). Interactions between DNA-bound repressors govern regulation by the λ phage repressor. Proc. Nat. Acad. Sci., U.S.A. 76, 5061-5065.
Kadonaga, J. T., Carner, K. R., Masiarz, F. R. & Tjian, R. (1987). Isolation of cDNA encoding transcription factor Sp1 and functional analysis of the DNA binding domain. Cell, 51, 1079-1090.
Kissinger, C. R., Liu, B., Martin-Blanco, E., Kornberg, T. H. & Pabo, C. O. (1990). Crystal structure of an engrailed homeodomain-DNA complex at 2.8 Å resolution: a framework for understanding homeodomain-DNA interactions. Cell, 63, 579-590.
Klevit, R. E., Herriott, J. R. & Horvath, S. J. (1990). Solution structure of a zinc-finger domain of yeast ADR1. Proteins, 7, 215-226.
Klug, A. & Rhodes, D. (1987). ‘Zinc-fingers’: a novel protein motif for nucleic acid recognition. Trends Biochem. Sci. 12, 464-469.
Kochoyan, M., Havel, T. F., Nguyen, D. T., Dahl, C. E., Keutmann, H. T. & Weiss, M. A. (1991). Alternating zinc-fingers in the human male associated protein ZFY: 2D NMR structure of an even finger and implications for “Jumping-Linker” DNA recognition. Biochemistry, 30, 3371-3386.
Layne, E. (1957). Spectrophotometric and turbidimetric methods for measuring proteins. Methods Enzymol. 3, 447-454.
Lee, M. S., Gippert, G. P., Soman, K. V., Case, D. A. & Wright, P. E. (1989). Three-dimensional solution structure of a single zinc-finger DNA-binding domain. Science, 245, 635-637.
Lutter, L. C. (1978). Kinetic analysis of deoxyribonuclease I cleavages in the nucleosome core: evidence for a DNA superhelix. J. Mol. Biol. 124, 391-420.
Maxam, A. M. & Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65, 499-560.
McLeod, M., Stein, M. & Beach, D. (1987). The product of the mei3+ gene, expressed under control of the mating-type locus, induces meiosis and sporulation in fission yeast. EMBO J. 6, 729-736.
Miller, J., McLachlan, A. D. & Klug, A. (1985). Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes. EMBO J. 4, 1609-1614.
Miller, J., Fairall, L. & Rhodes, D. (1989). A novel method for the purification of the Xenopus transcription factor IIIA. Nucl. Acids Res. 17, 9185-9192.
Nagai, K., Nakaseko, Y., Nasmyth, K. & Rhodes, D. (1988). Zinc-finger motifs expressed in E. coli and folded in vitro direct specific binding to DNA. Nature (London), 332, 284-286.
Nardelli, J., Gibson, T. J., Vesque, C. & Charnay, P. (1991). Base sequence discrimination by zinc-finger DNA-binding domains. Nature (London), 349, 175-178.
Neuhaus, D. & Rhodes, D. (1991). Putting the finger on DNA. Curr. Biol. 1, 268-270.
Ogata, R. T. & Gilbert, W. (1978). An amino-terminal fragment of lac repressor binds specifically to lac operator. Proc. Nat. Acad. Sci., U.S.A. 75, 5851-5854.
Ogata, R. T. & Gilbert, W. (1979). DNA-binding site of lac repressor probed by dimethylsulphate methylation of lac operator. J. Mol. Biol. 132, 709-728.
Omichinski, J. G., Clore, G. M., Appella, E., Sakaguchi, K. & Gronenborn, A. M. (1990). High-resolution three-dimensional structure of a single zinc-finger from a human enhancer binding protein in solution. Biochemistry, 29, 9326-9334.
Pavletich, N. P. & Pabo, C. O. (1991). Zinc-finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å. Science, 252, 809-817.
Rhodes, D. (1989). Analysis of sequence-specific DNA-binding proteins. In Protein Function: A Practical Approach, pp. 177-198, IRL Press, Oxford.
Riggs, A. D., Suzuki, H. & Bourgeois, S. (1970). lac repressor-operator interaction: I. equilibrium studies. J. Mol. Biol. 48, 67-93.
Sakaguchi, K., Appella, E., Omichinski, J. G., Clore, G. M. & Gronenborn, A. M. (1991). Specific DNA binding to a major histocompatibility complex enhancer sequence by a synthetic 57-residue double zinc-finger peptide from a human enhancer binding protein. J. Biol. Chem. 266, 7306-7311.
Sakonju, S. & Brown, D. D. (1982). Contact points between a positive transcription factor and the Xenopus 5 S RNA gene. Cell, 31, 395-405.
Smith, D. R., Jackson, I. J. & Brown, D. D. (1984). Domains of the positive transcription factor specific for the Xenopus 5 S RNA gene. Cell, 37, 645-652.
Smith, J. M. & Thomas, D. J. (1990). Quantitative analysis of one-dimensional gel electrophoresis profiles. CABIOS, 6, 93-99.
Sproat, B. S. & Gait, M. J. (1984). Solid-phase synthesis of oligodeoxyribonucleotides by the phosphotriester method. In Oligonucleotide Synthesis: A Practical Approach, pp. 83-115, IRL Press, Oxford.
Studier, F. W., Rosenberg, A. H., Dunn, J. J. & Dubendorff, J. W. (1990). Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185, 60-89.
Suck, D., Lahm, A. & Oefner, C. (1988). Structure refined to 2 Å of a nicked DNA octanucleotide complex with DNase I. Nature (London), 332, 464-468.
Thukral, S. K., Eisen, A. & Young, E. T. (1991). Two monomers of yeast transcription factor ADR1 bind a palindromic sequence symmetrically to activate ADH2 expression. Mol. Cell. Biol. 11, 1566-1577.
Tullius, T. D., Dombroski, B. A., Churchill, M. E. A. & Kam, L. (1987). Hydroxyl radical footprinting: a high-resolution method for mapping protein-DNA contacts. Methods Enzymol. 155, 537-558.
Wolberger, C., Vershon, A. K., Liu, B., Johnson, A. D. & Pabo, C. O. (1991). Crystal structure of a MATα2 homeodomain-operator complex suggests a general model for homeodomain-DNA interactions. Cell, 67, 517-528.
Yanari, S. & Bovey, F. A. (1960). Interpretation of the ultraviolet spectral changes of proteins. J. Biol. Chem. 235, 2818-2826.
Edited by T. Richmond
Note added in proof. Tramtrack binding sites have been identified in the promoter region of the Drosophila gene even skipped. They are of the consensus sequence GCAGGACC (Read & Manley, 1992; EMBO J. 11, 1035-1044).
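As a minimal illustrative sketch, and not part of the analysis reported here, candidate sites matching the consensus GCAGGACC quoted above could be located on either strand of a sequence by exact string matching. The function names and the test sequences are hypothetical, and exact matching is an assumption made for simplicity: real binding sites may tolerate mismatches at semiconserved positions.

```python
# Sketch only: exact-match scan for the consensus quoted in the note above.
# Helper names are hypothetical; only the consensus string comes from the text.
CONSENSUS = "GCAGGACC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = CONSENSUS):
    """Return sorted (start, strand) pairs for every exact match.

    start is the 0-based index on the given (top) strand; strand is "+"
    when the motif itself is found and "-" when its reverse complement is.
    """
    seq = seq.upper()
    motif = motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid double-reporting palindromic motifs
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


if __name__ == "__main__":
    # Arbitrary test string, not a real promoter sequence.
    print(find_sites("TTGCAGGACCAAGGTCCTGCTT"))
```

Scanning both strands reflects the fact that a site may lie in either orientation relative to the sequence as written.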