DNA-binding Peptides

7.13 DNA-binding Peptides

INDRANEEL GHOSH, SHAO YAO, and JEAN CHMIELEWSKI
Purdue University, West Lafayette, IN, USA

7.13.1 INTRODUCTION
7.13.2 DNA-BINDING PEPTIDES BASED ON PROTEIN MOTIFS
    7.13.2.1 α-Helices in the Major Groove
        7.13.2.1.1 Helix–turn–helix peptides
        7.13.2.1.2 Basic-helix–loop–helix peptides
        7.13.2.1.3 Basic-leucine zipper peptides
        7.13.2.1.4 Peptides based on the zinc finger motif
    7.13.2.2 β-Sheet Peptides in the Major Groove
7.13.3 PEPTIDE MINOR GROOVE BINDERS
7.13.4 PEPTIDE INTERCALATORS
7.13.5 CONCLUSIONS
7.13.6 REFERENCES

7.13.1 INTRODUCTION

Watson and Crick, upon deducing the structure of DNA, were quick to point out that "there is room between polynucleotide chains for a polypeptide to wind around the same helical axis."¹ Since that time many structures have been solved for protein–DNA complexes. This information has been critical in understanding the nature of protein–DNA interactions and in advancing the design of peptides which make specific interactions with DNA, through either major or minor groove contacts. The focus of this chapter, therefore, is to enumerate the many peptides of approximately 60 amino acid residues or fewer with the ability to make specific interactions with double-stranded, B-form DNA.

7.13.2 DNA-BINDING PEPTIDES BASED ON PROTEIN MOTIFS

7.13.2.1 α-Helices in the Major Groove

Numerous DNA-binding proteins rely on interactions of α-helical portions of the protein with the DNA major groove. A variety of structural motifs have been identified within α-helical DNA-binding proteins, including the helix–turn–helix (HTH), basic-helix–loop–helix (bHLH), basic leucine-zipper (bZip), and zinc finger motifs.²–⁶ The peptides of these motifs are either small full-length proteins of less than approximately 60 amino acids, or truncated forms of the DNA-binding protein which contain either the entire binding motif or smaller portions of the motif containing the residues responsible for sequence-specific interactions with DNA.

7.13.2.1.1 Helix–turn–helix peptides

The HTH motif has been found in a wide range of prokaryotic and eukaryotic transcription factors. The motif is composed of approximately 21 residues which fold into two helices, known as helix-2 and helix-3, which dock onto one another at a 120° angle with an intervening three- or four-residue turn (Figure 1).⁷–⁹ The recognition helix, helix-3, is an integral part of the folded protein structure, but the surface-exposed portions of this helix make contact with the edges of the base pairs in the major groove along with neighboring phosphodiester moieties.

Figure 1 Sequence alignment for HTH peptides. Residues with stars interact with DNA.

The HTH motif has been the subject of many review articles.²–⁹ Our intent, therefore, is to focus on a few examples of small or truncated HTH proteins, and, where possible, to delineate the point at which DNA binding is lost upon truncation. Hin recombinase, for example, is composed of a DNA-binding domain which contains an HTH motif, and a recombination domain which results in DNA inversion.¹⁰ Synthesis of a 52 amino acid residue peptide based on the Hin DNA-binding domain, which comprises the HTH motif with an additional amino-terminal turn–helix (residues 139–190), yielded a peptide that bound specifically to an oligonucleotide fragment containing the hixL cross-over site.¹¹ The dissociation constant obtained for 52mer hixL half-site binding was approximately 2 µM compared with the 40 nM dissociation constant obtained with the full-length Hin protein. A cocrystal structure of Hin recombinase with DNA confirmed the presence of the HTH motif and elucidated the interactions between helix-3 and the major groove of DNA (Figure 2).¹² Similar results were obtained with the related protein resolvase recombinase; a carboxy-terminal proteolysis fragment containing the HTH motif of resolvase bound DNA with a dissociation constant of 0.5–2.0 µM compared with <0.2 nM for the intact protein.¹³ The 52mer of Hin also inhibited Hin inversion by binding to the hixL site, although this inhibition could be overcome by increasing the Hin concentration.¹¹ Further truncation of Hin recombinase to a 31 amino acid peptide composed only of the HTH sequence (residues 160–190) resulted in a complete loss of sequence-specific DNA binding and no inhibition of Hin inversion.¹¹

Figure 2 Cocrystal structure of Hin recombinase and DNA.¹²

The 66 amino acid cro protein from λ phage contains an HTH motif and recognizes the 17 base pair λ phage OR3 sequence.¹⁴ λ cro binds DNA as a noncovalent dimer with the recognition helices bound into two adjacent DNA half sites. The full-length λ cro, which contains, in addition to the HTH motif, three β-strands and one helix, has a dissociation constant of <5 nM for the OR3 operator site.¹⁵ Dimerization can be inhibited by the addition of five residues to the C-terminus of λ cro; these additional residues were designed to bind the C-terminal β-strand of λ cro and produce a folded monomeric version.¹⁵ The monomeric cro derived from this procedure was found to fold similarly to the wild-type dimeric protein as determined by circular dichroism and NMR, but showed severely reduced affinity for the OR3 site (>19 µM). Truncation of λ cro to a peptide containing only the helix–turn–helix motif (residues 16–35)¹⁶ or to a peptide containing just the recognition helix (residues 26–39)¹⁷ resulted in a loss of sequence-specific DNA binding. The peptide containing the HTH motif was not a stable folding unit, as determined by circular dichroism, which showed a helical content of 12%.¹⁶ Attempts to stabilize the HTH structure in this peptide by incorporating a disulfide linkage at the interface of helix-2 and helix-3 between residues 20 and 30 led to only a small increase in the helical content (20%) and no enhanced DNA binding.¹⁶

The homeodomain is a 60 amino acid region of proteins which are expressed by the homeotic genes. These genes specify the body plan and regulate development in higher organisms, and the homeodomain, which contains an HTH motif, represents the DNA binding domain of the larger homeodomain proteins.¹⁸,¹⁹ Peptides corresponding to the full homeodomain of a number of proteins, such as antennapedia, engrailed, even-skipped, ultrabithorax, fushi tarazu, and MATa1/MATα2, have been prepared and analyzed for sequence-specific DNA binding. A 68 amino acid homeodomain peptide of the antennapedia protein (residues 297–363), for example, was expressed and was found to bind as a monomer to an oligonucleotide containing an ATTA site with a dissociation constant of 1.6 nM.²⁰ The solution structure of the 68mer confirmed that antennapedia contained an HTH motif in addition to two other helices.²¹ A synthetic 60 amino acid homeodomain peptide had similar activity,²² whereas truncation of the homeodomain to a peptide containing only the HTH motif (residues 28–55, homeodomain numbering) resulted in the loss of sequence-specific binding to DNA.¹⁷

A 61 residue peptide corresponding to the engrailed homeodomain was prepared and found to bind to an oligonucleotide containing a TAAT subsite with a dissociation constant of 1 nM.²³ A cocrystal structure of the engrailed peptide with DNA confirmed the presence of an HTH motif bound into the major groove of DNA at the TAAT site, with an N-terminal tail binding into the minor groove of the DNA and an additional two helices providing stabilizing interactions with the recognition helix (Figure 3).²³ A similar structure of the 60 residue homeodomain peptide of the monomeric even-skipped (Eve) protein was also solved with an oligonucleotide containing an ATTA core sequence.²⁴ Slightly longer peptide sequences containing the homeodomains of the ultrabithorax (72mer)²⁵ and fushi tarazu (73mer)²⁶ proteins were also found to bind to oligonucleotides containing TAAT sequences with dissociation constants of 0.1 nM and 0.7 nM, respectively.

Figure 3 Cocrystal structure of the engrailed homeodomain and DNA.²³

While most homeodomain peptides have been found to bind DNA as a monomer, an interesting case of heterodimeric homeodomain DNA binding has been discovered. The homeodomain peptide (74mer) of the MATα2 protein binds to DNA as a monomer.²⁷ A similar homeodomain peptide (57mer) derived from the MATa1 protein shows no detectable DNA binding.²⁸ In the diploid a/α cell type, however, these proteins form a heterodimer which binds to sites upstream of the haploid-specific genes (hsg).²⁹–³¹ The truncated homeodomain peptides still maintain sequence-specific binding to the hsg operator with a 10-fold reduction in affinity compared with the full-length proteins.³² A cocrystal structure of the homeodomain peptides of MATα2 and MATa1 with an oligonucleotide containing α2 and a1 binding sites clearly illustrates the heterodimeric interactions between the two peptides.³² A C-terminal tail of MATα2 (residues 59–74) binds in a helical conformation between helices 1 and 2 of MATa1, with helix 3 of both peptides binding into the major groove of DNA, and the N-terminal arms binding into the minor groove, as has been observed with other homeodomain proteins.

The results obtained with peptides containing the HTH motif point to a few conclusions concerning how far these peptides may be truncated before losing activity. In all cases where sequence-specific DNA binding was maintained, there was at least one other helix to provide stabilization to the HTH motif, as with Hin recombinase, for example. Truncation to peptides containing only the HTH motif resulted in the loss of a well-defined conformation and also loss of specific DNA affinity. It should be possible to truncate the homeodomain proteins such that they contain only three helices and still maintain binding, as has been observed with Hin, but these experiments have not been carried out to date. Methods to stabilize the conformation of monomeric HTH sequences would also have the potential to produce smaller DNA binding peptides based on this motif.

7.13.2.1.2 Basic-helix–loop–helix peptides

The bHLH motif is another highly conserved region in a number of transcription factors, which is composed of approximately 60 amino acids.²–⁶,³³,³⁴ This motif is composed of a dimerization interface known as the helix–loop–helix and a DNA-binding region composed of a number of basic residues. In the case of the bHLH motif, peptides which encompass just this region are able to bind DNA sequence specifically. A few examples of peptides that have been prepared, which encompass the bHLH motif, are IEB E47, MyoD, and USF (Figure 4).

Figure 4 Sequence alignment for bHLH peptides. Residues in bold interact with DNA.

IEB E47 is a bHLH protein which plays an important role in activating expression of the immunoglobulin light chain gene by binding to the κE2 enhancer site.³⁵ The E47 bHLH peptide, composed of residues 336–394 of the full-length protein, also bound to an oligonucleotide containing the κE2 sequence (CAGGTG) with half-maximal binding occurring at approximately 5 µM.³⁶ A cocrystal structure of an E47 bHLH peptide (residues 335–392) with an oligonucleotide containing a CAGGTG sequence has been solved and confirms the proposed bHLH fold.³⁷ The two amphiphilic helices of E47 form the dimerization interface, which is composed of a parallel, four-helix bundle, and the basic region forms a helical extension of helix 2 that binds into the major groove of the DNA (Figure 5).

Figure 5 Cocrystal structure of the IEB E47 bHLH peptide and DNA.³⁷

A peptide composed of the bHLH region of MyoD (residues 102–166) was bacterially expressed, and was also found to bind to an oligonucleotide containing a CAGGTG site and a CACGTG site.³⁸ By incorporating specific cysteine mutations into the MyoD peptide and performing cross-linking experiments, a parallel orientation of the helices at the dimerization interface was proposed, as was observed with E47. Experiments have been performed to enumerate smaller peptides of MyoD with DNA binding capabilities. By cross-linking and geometrically constraining two peptides corresponding to the basic region of MyoD with a C2-symmetric diol template, it was determined that agents containing the (R,R) and (S,S) stereochemistry formed a specific complex with an oligonucleotide containing the MyoD binding site.³⁹ A monomeric uncross-linked MyoD peptide, on the other hand, showed no affinity for the same DNA sequence.³⁹

The USF protein is similar to other bHLH proteins in that it contains a bHLH motif, but in addition it contains a leucine zipper domain (bHLHZ) to provide additional stabilization to the homodimer. Both the bHLH (residues 197–260) and bHLHZ (residues 197–310) portions of USF have been prepared and their DNA affinity evaluated.⁴⁰ The bHLHZ peptide binds to an oligonucleotide containing a CACGTG sequence with a dissociation constant (1.3 nM) that was indistinguishable from that of the full-length protein. The bHLH peptide did not form an electrophoretically stable complex with the same DNA, but circular dichroism spectroscopy suggests that the bHLH peptide binds to DNA in a sequence-specific manner which is indistinguishable from that of the full-length protein. A cocrystal structure was also solved for the complex of the bHLH peptide of USF and DNA, and the overall protein fold and interaction with DNA are very similar to those observed with the bHLH peptide of E47.⁴⁰

These experiments serve to illustrate that the bHLH peptides are a fully folded and functional protein motif. The experiments with MyoD also demonstrate that the basic region, as long as it is dimerized in the appropriate conformation, can function as an autonomous DNA binding sequence.
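Because the bHLH peptides above bind as dimers, it can be helpful to check how symmetric each target site is. The following is a minimal sketch, not taken from the chapter: it computes the reverse complement of the two sites named in the text (CAGGTG and CACGTG) and reports whether each reads identically on both strands. The function names are illustrative.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(site: str) -> str:
    """Return the opposite strand of `site`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site.upper()))


def is_palindromic(site: str) -> bool:
    """True when both strands of the site read identically 5' to 3'."""
    return site.upper() == reverse_complement(site)


def half_sites(site: str) -> tuple:
    """Return the 5' half of each strand for an even-length site."""
    mid = len(site) // 2
    return site.upper()[:mid], reverse_complement(site)[:mid]


if __name__ == "__main__":
    for site in ("CACGTG", "CAGGTG"):
        print(site, reverse_complement(site), is_palindromic(site), half_sites(site))
```

Under this check CACGTG is its own reverse complement, so both subunits of a homodimer would see the same half-site (CAC), whereas CAGGTG presents two different half-sites (CAG and CAC).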

7.13.2.1.3 Basic-leucine zipper peptides

The bZip DNA-binding motif also relies on a dimerization region, termed the leucine zipper, to mediate the interactions of a highly basic domain with DNA.²–⁶,⁴¹ The core bZip motif is composed of approximately 60 amino acid residues (Figure 6(a)), and studies of the bZip domains of a variety of proteins such as GCN4, C/EBP, and Jun have shown that these peptides maintain sequence-specific DNA binding.⁴²–⁴⁴ A cocrystal structure of the bZip regions of Fos/Jun with an oligonucleotide containing an AP-1 site has confirmed the coiled-coil nature of the dimerization interface and has demonstrated that the basic region binds to the major groove of DNA in a helical, scissor-grip binding mode (Figure 6(b)).⁴⁵

New bZip peptides with altered DNA binding specificities have also been obtained by using an in vitro selection method.⁴⁶ A library of 3.2 × 10⁶ mutant bZip C/EBP peptides was prepared by randomizing five DNA-binding residues in the basic region, and peptides of the library were selected for binding to either mutant or wild-type DNA sequences. Mutant peptides were found which bind to the corresponding mutant DNA sequences with an affinity similar to that of the wild-type bZip C/EBP peptide for the wild-type DNA.

A "minimalistic" approach has also been applied to the design of a peptide based on the GCN4 bZip domain.⁴⁷ Residues of the basic region which were predicted to be nonessential for DNA binding were replaced with Ala residues, and a de novo designed heptad repeat replaced a portion of the leucine zipper. The bZip peptide obtained had only 43% sequence homology to GCN4, but bound specifically to an oligonucleotide containing the TRE site.

Truncation of the bZip motif down to peptides containing solely the basic region has generally shown the importance of the dimerization domain for DNA binding. A number of methods have been developed to cross-link the basic regions into a functional "dimer." Disulfide cross-linking at the C-terminus of a basic region peptide of GCN4, for instance, resulted in a peptide which binds to an oligonucleotide containing the GCN4 recognition element (ATGACT) with a dissociation constant of approximately 10 nM at 4 °C, but specific DNA binding was found to be temperature dependent.⁴⁸ Further truncation of the basic region down to a dimer of a peptide containing as few as 20 residues of GCN4 resulted in DNA binding with a specificity similar to that of the intact protein.⁴⁹

Metal binding has also been reported as a means of dimerizing the GCN4 basic region.⁵⁰ Incorporating a terpyridyl moiety at the C-terminus of the GCN4 basic region followed by dimerization of the peptide with addition of Fe²⁺ produced a species which bound to an oligonucleotide containing a CRE binding site with a dissociation constant of 0.13 nM at 4 °C. Ueno et al. have employed the noncovalent interactions between β-cyclodextrin and adamantane as a means to dimerize the basic region of GCN4.⁵¹ Two peptides were prepared in which one peptide contained β-cyclodextrin at the C-terminus and the other contained an adamantyl unit. DNA binding to an oligonucleotide containing an ATGACT site was only observed when a 1:1 mixture of the two peptides was in solution, with approximately 50% of the DNA in the complexed form at 50 nM at 4 °C.
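A statement such as "approximately 50% of the DNA in the complexed form at 50 nM" can be related to an apparent dissociation constant with a simple binding isotherm. The sketch below assumes 1:1 binding with the peptide species in large excess over DNA, so that the fraction bound is [P] / ([P] + Kd). This is an illustrative simplification and not necessarily the analysis used in the cited work: it treats the noncovalent peptide pair as a single species, and the function names are hypothetical.

```python
def fraction_bound(peptide_nM: float, kd_nM: float) -> float:
    """Fraction of DNA complexed, assuming 1:1 binding and peptide in excess."""
    if peptide_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return peptide_nM / (peptide_nM + kd_nM)


def apparent_kd_nM(peptide_nM: float, bound: float) -> float:
    """Invert the isotherm for a single (concentration, fraction bound) point."""
    if not 0.0 < bound < 1.0:
        raise ValueError("fraction bound must lie strictly between 0 and 1")
    return peptide_nM * (1.0 - bound) / bound


if __name__ == "__main__":
    # Half of the DNA bound at 50 nM corresponds to an apparent Kd of 50 nM.
    print(apparent_kd_nM(50.0, 0.5))
    # Expected occupancy at 50 nM peptide for a tighter, 10 nM binder.
    print(round(fraction_bound(50.0, 10.0), 3))
```

Under these assumptions the half-saturation point is the apparent Kd itself, which gives a rough way to compare a single-point measurement with the dissociation constants quoted for the covalently linked dimers.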


Figure 6 (a) Basic region sequence alignment for bZip peptides, and (b) cocrystal structure of the Fos/Jun heterodimer and DNA.⁴⁵

In an interesting set of experiments disulfide-linked peptides corresponding to the basic region of Jun have been prepared in which the cross-linking was performed in three different ways: N- to N-termini, C- to C-termini, and N- to C-termini.⁵²,⁵³ The three cross-linked peptides each bound to oligonucleotides having the appropriate half-site orientation with a dissociation constant of approximately 4 nM at 4 °C. Extension of this work to a peptide containing three cross-linked Jun basic regions, which were disulfide-linked in the C- to C- to N-termini orientation, also provided specific DNA binding to an oligonucleotide which contained three half-binding sites with a dissociation constant of approximately 5 nM at 4 °C.⁵⁴ In an analogous set of experiments the basic regions of GCN4 and C/EBP were covalently cross-linked via C-terminal Lys residues to form either the homodimeric peptides or the heterodimeric peptide.⁵⁵ The binding affinity and specificity of the homodimeric peptides mirrored the results obtained with the bZip peptides of GCN4 and C/EBP, whereas the heterodimeric peptide bound specifically to an oligonucleotide containing both the GCN4 and C/EBP half sites.

In a recent study, Goddard et al. have demonstrated that a monomeric basic peptide from Jun displays affinity for the AP-1 site.⁵⁶ Taylor et al. have prepared peptides corresponding to a single GCN4 basic region with carboxamide cross-links to stabilize the helical conformation and have observed specific DNA binding.⁵⁷ Although the binding obtained for these monomeric peptides is weaker than that observed for dimeric peptides, these experiments open up the possibility of designing peptides with increased DNA affinity based on monomeric peptide sequences.


7.13.2.1.4 Peptides based on the zinc finger motif

The zinc finger motif appears to be one of the most widely used domains of DNA binding proteins.²–⁶,⁵⁸,⁵⁹ To date, four cocrystal structures between zinc finger proteins and DNA have been solved, and numerous two-dimensional NMR structures have been solved for individual zinc fingers. Zinc finger proteins fall into three general classes. The first class of zinc finger motifs contains approximately 25 residues, including two Cys and two His residues, which fold into a compact unit composed of a helix packed against a β-hairpin. The second class is composed of about 30 residues and contains four Cys residues. Together two motifs of this class form a single structural unit with one dimerization helix and one recognition helix interacting in the major groove. A third smaller family of zinc finger motifs contains two zinc ions and six Cys residues. As in the second class there is a dimerization and a recognition helix. In this chapter the focus will be on determining the minimal zinc finger sequence involved in sequence-specific DNA binding. Peptides are included which contain from one to three zinc finger units, although the triple fingers are longer than our arbitrary size limit of 60 amino acid residues.

(i) Single-zinc-finger peptides

It was initially believed that peptides containing a single zinc finger, especially of the first class, were not able to bind sequence specifically to DNA. Recently two examples of single zinc fingers with high-affinity, specific DNA binding have been reported. In one case a peptide from the Drosophila transcription factor GAGA (first class) containing residues 310–372 was prepared and bound specifically to an oligonucleotide containing the core consensus sequence GAGAGA.⁶⁰ The zinc finger motif in this peptide, however, is flanked on both termini by highly basic regions. Removal of 19 amino acids on the C-terminus had no effect on DNA binding, whereas removal of 27 residues from the N-terminus resulted in complete loss of sequence-specific DNA binding. More structural information will be essential to determine if GAGA interacts with DNA in the same fashion as other zinc-finger proteins of the first class.

In a second example a 59 amino acid peptide containing the C-terminal zinc finger with its adjacent basic region derived from the erythroid transcription factor GATA-1 (second class) was synthesized and was found to bind sequence-specifically to an oligonucleotide containing a single GATA motif with an affinity that was only an order of magnitude less than that of the full-length GATA-1 protein.⁶¹ Removal of six residues from the C-terminal basic region resulted in complete loss of DNA binding. In both of these examples, although the single zinc finger is necessary for DNA binding, the finger alone is not sufficient for high-affinity DNA recognition; an adjacent basic region is also essential. The examples where the zinc finger alone has been used for DNA binding have shown nonspecific affinity for DNA.

(ii) Double-zinc-finger peptides

Peptide sequences containing one pair of zinc finger motifs have been described with the ability to make sequence-specific interactions with DNA. Two-finger motifs from proteins of the first class have been prepared, such as the Drosophila melanogaster regulatory protein Tramtrack, a human enhancer binding protein, MBP-1, and the transcription factor SWI5, which contain 66, 57, and 70 amino acids, respectively. The Tramtrack double-zinc-finger peptide binds in a sequence-specific manner to an oligonucleotide containing a natural target site with a dissociation constant of approximately 400 nM.⁶² A cocrystal structure has been solved for Tramtrack with DNA, and each of the two zinc fingers forms an independent DNA-binding domain, with residues at the N-terminus of the helix binding into the major groove of the DNA (Figure 7).⁶³ Similarly, the MBP-1 peptide was found to interact with an oligonucleotide comprising a portion of the major histocompatibility complex enhancer sequence, with a dissociation constant of 140 nM.⁶⁴ The SWI5 peptide, which contains two zinc fingers, maintained specific DNA binding, although much higher peptide concentrations were needed for binding compared to a three-zinc-finger peptide from SWI5.⁶⁵

Figure 7 Cocrystal structure of the Tramtrack zinc finger peptide and DNA.⁶³

Two-finger peptides from the second and third classes of zinc-finger proteins have also been studied. The 71-residue peptide fragment of the glucocorticoid receptor,⁶⁶ the double-zinc-finger peptide from the estrogen receptor,⁶⁷ and a 54-residue peptide from the yeast transcriptional activator GAL4⁶⁸ (third class) bind specifically to DNA. Interestingly, however, each double-zinc-finger motif of the second and third classes utilizes only one helix when interacting with the major groove of DNA, and dimerization accounts for higher affinity binding.

With a knowledge of the specific orientation of peptide–DNA binding interactions, it should be possible to design fused peptides with unique DNA binding specificities. A novel DNA-binding peptide was designed in this way by covalently linking a 57-residue, two-finger peptide of Zif268 to the 61-residue homeodomain of the Oct-1 protein.⁶⁹ The two-peptide fusion protein bound optimally (Kd of 0.8 nM) to an oligonucleotide containing adjacent homeodomain (TAATTA) and zinc finger (TGGGCG) subsites.

(iii) Triple-zinc-finger peptides

A number of peptides containing three zinc-finger units have been shown to bind specifically to DNA. The main reason for including triple-zinc-finger peptides in this chapter is to point out the design work of Berg et al. in which individual zinc fingers from the first class of the motif were mixed and matched to obtain a desired DNA binding specificity (Figure 8).⁷⁰ The peptide was designed to contain the consensus sequence of CP-1, but the residues involved in DNA recognition were modified in each of the three zinc fingers. The first finger of the peptide was based on a mutant of the human transcription factor Sp1 and contained residues Gln13, Asp16, and Arg19. The second finger was based on the transcription factors Zif268 and Sp1, and contained residues Arg14, Glu16, and Arg19. The third finger was based on mutants of Sp1 and Krox-20, and contained residues Arg13, His16, and Arg19. The designed peptide bound sequence specifically to an oligonucleotide containing the predicted binding site 5′-GGG GCG GCT-3′ with a dissociation constant of approximately 2–3 nM. This approach relies on the similarity of the DNA binding structure in the first class of zinc finger proteins, but has great potential for specifically recognizing any length and sequence of DNA at will.

Figure 8 A designed zinc finger peptide shown with its cognate DNA sequence.

In another set of experiments a library of sequences was used to determine if a code for zinc finger–DNA interactions could be developed.⁷¹ A phage display library was prepared which contained 2.6 × 10⁶ sequences based on three fingers of Zif268. This library was evaluated for binding to operator sequences in which the middle DNA triplet was altered. The results obtained from this study highlight the fact that there are only three positions involved in the recognition of DNA, and that only a limited set of amino acids are used in these positions. In a similar fashion randomized DNA sequences were used to identify potentially coded zinc finger sequences.⁷² The results of these studies were applied to the recognition of a specific oncogenic DNA site by a randomized zinc finger peptide using phage display.⁷³ A peptide sequence was obtained which bound specifically to the oncogenic site with a dissociation constant of 620 nM.
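The "mix and match" idea for first-class zinc fingers can be made concrete by splitting a target site into base triplets and pairing each triplet with one finger. The sketch below does this for the designed site quoted in the text (GGG GCG GCT). It rests on one loud assumption that the chapter does not state: that finger 1 reads the 3′-most triplet, as in the antiparallel arrangement commonly described for Zif268-type complexes. The residue triples are those listed in the text for the three designed fingers; the function names are illustrative.

```python
# Recognition residues listed in the text for fingers 1-3 of the designed peptide.
RECOGNITION_RESIDUES = {
    1: ("Gln", "Asp", "Arg"),
    2: ("Arg", "Glu", "Arg"),
    3: ("Arg", "His", "Arg"),
}


def split_triplets(site: str) -> list:
    """Split a 5'->3' site into consecutive base triplets, ignoring spaces."""
    bases = site.replace(" ", "").upper()
    if len(bases) % 3 != 0:
        raise ValueError("site length must be a multiple of three")
    return [bases[i:i + 3] for i in range(0, len(bases), 3)]


def assign_fingers(site: str) -> dict:
    """Pair fingers with triplets.

    ASSUMPTION (not from the chapter): finger 1 binds the 3'-most triplet,
    so the triplets are assigned to fingers in reverse order.
    """
    triplets = split_triplets(site)
    return {finger: triplet
            for finger, triplet in enumerate(reversed(triplets), start=1)}


if __name__ == "__main__":
    for finger, triplet in assign_fingers("GGG GCG GCT").items():
        print(finger, triplet, "-".join(RECOGNITION_RESIDUES[finger]))
```

If the orientation assumption were reversed, only the `reversed(...)` call would change; the triplet decomposition itself follows directly from the site as written in the text.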

7.13.2.2 β-Sheet Peptides in the Major Groove

A number of protein–DNA cocrystal structures have been solved, for proteins such as TBP, and the Arc and Met repressors, in which the portion of the protein responsible for contacting the DNA exists to a large extent in a β-sheet conformation.⁷⁴,⁷⁵ The Arc repressor is composed of 53 amino acid residues and contains the essential folding unit of a ribbon–helix–helix structure (Figure 9).⁷⁶ Dimerization of Arc brings together residues 8–14 from each monomer to form a two-stranded antiparallel β-sheet which inserts into the major groove of an operator half-site. By covalently cross-linking two Arc monomers with a peptide linker, the affinity for the half-site operator was increased from 185 pM for the wild-type Arc to 1.7 pM for the cross-linked protein.⁷⁷

To date, there are no examples in the literature of cross-linked β-strand peptides to delineate the smallest folding unit for sequence-specific DNA binding. Surovaya et al. have designed constrained peptides which are composed of a β-strand–turn–β-strand motif either with or without disulfide cross-linking.⁷⁸ In this case, however, potential DNA-specific residues were incorporated at the ends of the β-strands and in the turn, and this region of the peptide is believed to interact with the minor groove, as demonstrated by distamycin competition binding experiments. Similarly, a novel zinc ribbon motif has been found in the eukaryotic transcription elongation factor TFIIS.⁷⁹ This structure is composed of a three-stranded β-sheet with a zinc binding site, and a 55 amino acid residue peptide corresponding to the zinc ribbon motif has a preferred affinity for oligopyrimidine single strands. Whether it is the β-sheet portion of the zinc ribbon or the turns which interact with DNA is yet to be established.
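The Arc cross-linking result can be put on a free-energy scale. As a rough sketch, assuming the two quoted values (185 pM and 1.7 pM) are equilibrium dissociation constants and assuming a temperature of 25 °C (the chapter gives neither detail for this experiment), the gain in binding free energy is ΔΔG = RT ln(Kd,new / Kd,ref). The helper names below are illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(kd_ref: float, kd_new: float) -> float:
    """How many times tighter the new binder is (both Kd values in the same unit)."""
    return kd_ref / kd_new


def ddg_kcal_per_mol(kd_ref: float, kd_new: float,
                     temperature_K: float = 298.15) -> float:
    """Change in binding free energy; negative values mean tighter binding.

    ASSUMPTION: 298.15 K is an illustrative default, not a value from the text.
    """
    return R_KCAL_PER_MOL_K * temperature_K * math.log(kd_new / kd_ref)


if __name__ == "__main__":
    # Wild-type Arc (185 pM) versus the cross-linked protein (1.7 pM).
    print(round(fold_change(185.0, 1.7), 1))
    print(round(ddg_kcal_per_mol(185.0, 1.7), 2))
```

On these assumptions the cross-link corresponds to a roughly 109-fold tighter complex, or about 2.8 kcal/mol of additional binding free energy. The same two functions can be applied to any pair of dissociation constants in this chapter, provided both are expressed in the same unit.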

Figure 9 Cocrystal structure of the Arc peptide and DNA.⁷⁶

7.13.3 PEPTIDE MINOR GROOVE BINDERS

There exist two major classes of peptides which interact specifically within the minor groove of DNA. One class is based upon the minor groove DNA-binding natural products distamycin and netropsin (Figure 10).⁸⁰–⁸² These crescent-shaped molecules consist of repeating units of pyrroles linked by amide bonds, and bind in the minor groove of DNA at sites containing four or five successive A,T base pairs. Many modifications to the natural structures have been made. The other class of minor groove-binding peptides are those derived from larger proteins which contain repeating units of proline and positively charged residues, termed generally the "A,T-hook."⁸³ These peptides are proposed to have a crescent shape similar to distamycin and netropsin, and interact in the minor groove with the backbone amides, forming hydrogen bonds with the DNA bases in the minor groove.

Figure 10 Minor groove DNA-binding natural products.

A,T-hook peptides derived from the yeast protein DAT1 and the nonhistone chromosomal protein HMG-I/Y, for example, have been shown to interact specifically in the minor groove of DNA. A 35-residue peptide corresponding to amino acids 2–36 of DAT1 bound in the minor groove at positions containing A-T tracts with a dissociation constant of 0.4 nM.⁸⁴ Within this peptide the sequence GRKPG is repeated three times, and the Arg residues were shown to be essential for high-affinity DNA binding. In a similar fashion an 11 amino acid peptide derived from HMG-I/Y with the sequence TPKRPRGRPKK was shown to have the same binding characteristics as the intact protein, and two-dimensional NMR experiments provided evidence that the RGR segment of the peptide is in contact with the minor groove.⁸⁵ A number of other DNA-binding proteins contain regions with sequence similarity to the DAT1 and HMG-I/Y peptides, but further experimentation is needed to determine if these sequences are sufficient for high-affinity, sequence-specific DNA binding.
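The features attributed to A,T-hook peptides (proline plus positively charged residues, with an RGR core) can be checked directly on the HMG-I/Y sequence quoted above. The following is a small illustrative sketch; the helper names are hypothetical and only the one-letter sequence comes from the text.

```python
HMG_IY_PEPTIDE = "TPKRPRGRPKK"  # sequence as quoted in the text


def find_motif(sequence: str, motif: str) -> list:
    """Return 1-based start positions of every (possibly overlapping) match."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


def composition(sequence: str) -> dict:
    """Count the residue types relevant to the A,T-hook description."""
    return {
        "length": len(sequence),
        "basic (K+R)": sequence.count("K") + sequence.count("R"),
        "proline": sequence.count("P"),
    }


if __name__ == "__main__":
    print(composition(HMG_IY_PEPTIDE))
    print(find_motif(HMG_IY_PEPTIDE, "RGR"))
```

The counts confirm the 11-residue length stated in the text, show that six of the eleven residues are Lys or Arg and three are Pro, and place the RGR segment at residues 6–8 of the peptide.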

7.13.4 PEPTIDE INTERCALATORS

A number of naturally occurring and designed peptides which contain two planar aromatic moieties have been found to interact with DNA by intercalation. The quinoxaline family of antitumor antibiotics (Figure 11), for instance, are a class of cyclic octadepsipeptides which contain two quinoxaline moieties attached to the peptide chain. Echinomycin and triostin A are two of the better known drugs of this family which have been shown to bisintercalate into DNA with the quinoxaline chromophores preferentially binding at CpG steps in the minor groove of a double helix.⁸⁶–⁸⁸ A functionally related class of cyclic decadepsipeptides including luzopeptins A–E,⁸⁹–⁹¹ BBM-928A,⁹² quinaldopeptin,⁹³ and sandramycin,⁹⁴ which contain two pendant quinoxaline moieties, have also been shown to interact with DNA by bifunctional intercalation.

Figure 11 Peptide intercalators of the quinoxaline family.

Naturally occurring sequences within proteins have been shown to interact with DNA by binding of specific tyrosine residues. A tandem repeat of the sequence SPTSPSY, for instance, has been found in the largest subunit of RNA polymerase II, with 26 units in yeast and 52 units in mammals. Synthetic peptides containing two repeating units were found to bind DNA by intercalation of tyrosine residues.⁹⁵ Synthetic linear peptides containing two aromatic moieties have also been designed to interact with DNA via intercalation.⁹⁶ A synthetic bis(acridine) containing a peptide of the sequence YKKG was found to bind to DNA by intercalation of both chromophores with a 140-fold enhancement of affinity, as compared to 9-aminoacridine, whereas introduction of two p-NO2-Phe residues into a peptide with the sequence KF(NO2)AF(NO2), where F(NO2) denotes p-NO2-Phe, also provided a peptide with bisintercalation properties.

7.13.5 CONCLUSIONS

Most of the examples of DNA-binding peptides presented in this chapter are simply truncated regions of proteins whose sequences correspond exactly to the DNA-binding portions of the proteins. As more structural information on protein–DNA interactions has become available, however, more modifications to DNA-binding motifs have been made. These efforts have led to peptides with unique DNA-binding specificities with the potential for therapeutic application. It is hoped that more examples of de novo designed peptides are now within reach.

7.13.6 REFERENCES

1. J. D. Watson and F. H. C. Crick, Nature, 1953, 171, 964.
2. P. F. Johnson and S. L. McKnight, Annu. Rev. Biochem., 1989, 58, 799.
3. T. A. Steitz, Q. Rev. Biophys., 1990, 23, 205.
4. C. O. Pabo and R. T. Sauer, Annu. Rev. Biochem., 1992, 61, 1053.
5. T. Ellenberger, Curr. Opin. Struct. Biol., 1994, 4, 12.
6. S. K. Burley, Curr. Opin. Struct. Biol., 1994, 4, 3.
7. H. C. M. Nelson, Curr. Opin. Struct. Biol., 1995, 6, 180.
8. S. C. Harrison and A. K. Aggarwal, Annu. Rev. Biochem., 1990, 59, 933.
9. R. G. Brennan, Curr. Opin. Struct. Biol., 1991, 1, 80.
10. R. T. Sauer, R. R. Yocum, R. F. Doolittle, M. Lewis, and C. O. Pabo, Nature, 1982, 298, 447.
11. M. F. Bruist, S. J. Horvath, L. E. Hood, T. A. Steitz, and M. I. Simon, Science, 1987, 235, 777.
12. J. A. Feng, R. C. Johnson, and R. E. Dickerson, Science, 1994, 263, 348.
13. S. S. Abdel-Meguid, N. D. F. Grindley, N. S. Templeton, and T. A. Steitz, Proc. Natl. Acad. Sci. USA, 1984, 81, 2001.
14. R. G. Brennan, S. L. Roderick, Y. Takeda, and B. W. Matthews, Proc. Natl. Acad. Sci. USA, 1990, 87, 8165.
15. M. C. Mossing and R. T. Sauer, Science, 1990, 250, 1712.
16. P. Bishop and J. Chmielewski, unpublished results.
17. R. Mayer, G. Lancelot, and C. Helene, FEBS Lett., 1983, 153, 339.


18. W. J. Gehring, Y. Q. Qian, M. Billeter, K. Furukubo-Tokunaga, A. F. Schier, D. Resendez-Perez, M. Affolter, G. Otting, and K. Wuthrich, Cell, 1994, 78, 211.
19. W. J. Gehring, M. Affolter, and T. Burglin, Annu. Rev. Biochem., 1994, 63, 487.
20. M. Affolter, A. Percival-Smith, M. Muller, W. Leupin, and W. J. Gehring, Proc. Natl. Acad. Sci. USA, 1990, 87, 4093.
21. G. Otting, Y. Q. Qian, M. Billeter, M. Muller, M. Affolter, W. J. Gehring, and K. Wuthrich, EMBO J., 1990, 9, 3085.
22. H. Mihara and E. T. Kaiser, Science, 1988, 242, 925.
23. C. R. Kissinger, B. Liu, E. Martin-Bianco, T. B. Kornberg, and C. O. Pabo, Cell, 1990, 63, 579.
24. J. A. Hirsch and A. K. Aggarwal, EMBO J., 1995, 14, 6280.
25. S. C. Ekker, K. E. Young, D. P. von Kessler, and P. A. Beachy, EMBO J., 1991, 10, 1179.
26. A. Percival-Smith, M. Muller, M. Affolter, and W. J. Gehring, EMBO J., 1990, 9, 3967.
27. C. Wolberger, A. K. Vershon, B. Liu, A. D. Johnson, and C. O. Pabo, Cell, 1991, 67, 517.
28. C. Goutte and A. D. Johnson, J. Mol. Biol., 1993, 233, 359.
29. C. Goutte and A. D. Johnson, Cell, 1988, 52, 875.
30. A. M. Dranginis, Nature, 1990, 347, 682.
31. C. Goutte and A. D. Johnson, EMBO J., 1994, 13, 1434.
32. T. Li, M. R. Stark, A. D. Johnson, and C. Wolberger, Science, 1995, 270, 262.
33. C. Murre, G. Bain, M. A. van Kijk, I. Engel, B. A. Furnari, M. E. Massari, J. R. Matthews, M. W. Quong, R. R. Rivera, and M. H. Stuiver, Biochim. Biophys. Acta, 1994, 1218, 129.
34. S. E. V. Phillips, Structure, 1994, 2, 1.
35. M. Lenardo, J. W. Pierce, and D. Baltimore, Science, 1987, 236, 1573.
36. P. Bishop, C. Jones, I. Ghosh, and J. Chmielewski, Int. J. Peptide Protein Res., 1995, 46, 149.
37. T. Ellenberger, D. Fass, M. Arnaud, and S. C. Harrison, Genes Dev., 1994, 8, 970.
38. S. J. Anthony-Cahill, P. A. Benfield, R. Fairman, Z. R. Wasserman, S. L. Brenner, W. F. Stafford, III, C. Altenbach, W. L. Hubbell, and W. F. DeGrado, Science, 1992, 255, 979.
39. T. Morii, M. Simomura, S. Morimoto, and I. Saito, J. Am. Chem. Soc., 1993, 115, 1150.
40. A. R. Ferre-D'Amare, P. Pognonec, R. G. Roeder, and S. K. Burley, EMBO J., 1994, 13, 180.
41. T. Alber, Curr. Opin. Genet. Dev., 1992, 2, 205.
42. I. A. Hope and K. Struhl, Cell, 1986, 46, 885.
43. J. A. Nye and B. J. Graves, Proc. Natl. Acad. Sci. USA, 1990, 87, 3993.
44. R. Turner and R. Tjian, Science, 1989, 243, 1689.
45. J. N. M. Glover and S. C. Harrison, Nature, 1995, 373, 257.
46. T. Sera and P. G. Schultz, Proc. Natl. Acad. Sci. USA, 1996, 93, 2920.
47. K. T. O'Neal, R. H. Hoess, and W. F. DeGrado, Science, 1990, 249, 774.
48. R. V. Talanian, C. J. McKnight, and P. S. Kim, Science, 1990, 249, 769.
49. R. V. Talanian, C. J. McKnight, R. Rutkowski, and P. S. Kim, Biochemistry, 1992, 31, 6871.
50. B. Cuenoud and A. Schepartz, Science, 1993, 259, 510.
51. M. Ueno, A. Murakami, K. Makino, and T. Morii, J. Am. Chem. Soc., 1993, 115, 12 575.
52. C. Park, J. L. Campbell, and W. A. Goddard, III, Proc. Natl. Acad. Sci. USA, 1992, 89, 9094.
53. C. Park, J. L. Campbell, and W. A. Goddard, III, Proc. Natl. Acad. Sci. USA, 1993, 90, 4892.
54. C. Park, J. L. Campbell, and W. A. Goddard, III, J. Am. Chem. Soc., 1995, 117, 6287.
55. M. Pellegrini and R. H. Ebright, J. Am. Chem. Soc., 1996, 118, 5831.
56. C. Park, J. L. Campbell, and W. A. Goddard, J. Am. Chem. Soc., 1996, 118, 4235.
57. B. Y. Wu, B. L. Gaffney, R. A. Jones, and J. W. Taylor, in "Peptides: Chemistry, Structure and Biology," eds. P. T. P. Kaumaya and R. S. Hodges, Mayflower Scientific Ltd., 1996, p. 265.
58. J. M. Berg, Acc. Chem. Res., 1995, 28, 14.
59. A. Klug and J. W. R. Schwabe, FASEB J., 1995, 9, 597.
60. P. V. Pedone, R. Ghirlando, G. M. Clore, A. M. Gronenborn, G. Felsenfeld, and J. G. Omichinski, Proc. Natl. Acad. Sci. USA, 1996, 93, 2822.
61. J. G. Omichinski, C. Trainor, T. Evans, A. M. Gronenborn, G. M. Clore, and G. Felsenfeld, Proc. Natl. Acad. Sci. USA, 1993, 90, 1676.
62. L. Fairall, S. D. Harrison, A. A. Travers, and D. Rhodes, J. Mol. Biol., 1992, 226, 349.
63. L. Fairall, J. W. R. Schwabe, L. Chapman, J. T. Finch, and D. Rhodes, Nature, 1993, 366, 483.
64. K. Sakaguchi, E. Appella, J. G. Omichinski, G. M. Clore, and A. M. Gronenborn, J. Biol. Chem., 1991, 266, 7306.
65. D. Neuhaus, Y. Nakaseko, K. Nagai, and A. Klug, FEBS Lett., 1990, 262, 179.
66. T. Hard, E. Kellenbach, R. Boelens, B. A. Maler, K. Dahlman, L. P. Freedman, J. Carlstedt-Duke, K. R. Yamamoto, J.-Å. Gustafsson, and R. Kaptein, Science, 1990, 249, 157.
67. J. W. R. Schwabe, L. Chapman, J. T. Finch, and D. Rhodes, Cell, 1993, 75, 567.
68. R. Marmorstein, M. Carey, M. Ptashne, and S. C. Harrison, Nature, 1992, 356, 408.
69. J. L. Pomerantz, P. A. Sharp, and C. O. Pabo, Science, 1995, 267, 93.
70. J. R. Desjarlais and J. M. Berg, Proc. Natl. Acad. Sci. USA, 1993, 90, 2256.
71. Y. Choo and A. Klug, Proc. Natl. Acad. Sci. USA, 1994, 91, 11 163.
72. Y. Choo and A. Klug, Proc. Natl. Acad. Sci. USA, 1994, 91, 11 168.
73. Y. Choo, I. Sanchez-García, and A. Klug, Nature, 1994, 372, 642.
74. S. E. Phillips, Annu. Rev. Biophys. Biomol. Struct., 1994, 23, 671.
75. B. E. Rauman, B. M. Brown, and R. T. Sauer, Curr. Opin. Struct. Biol., 1994, 4, 36.
76. B. E. Rauman, M. A. Rould, C. O. Pabo, and R. T. Sauer, Nature, 1994, 374, 754.
77. C. R. Robinson and R. T. Sauer, Biochemistry, 1996, 35, 109.
78. A. N. Surovaya, S. L. Gokhovskii, R. V. Brusov, Y. P. Lysov, A. L. Zhuze, and G. V. Gurskii, Mol. Biol., 1995, 28, 859.
79. X. Qian, C. J. Jeon, H. S. Yoon, K. Agarwal, and M. A. Weiss, Nature, 1993, 365, 277.
80. P. G. Schultz and P. B. Dervan, J. Biomol. Struct. Dyn., 1984, 1, 1133.
81. P. B. Dervan, Science, 1986, 232, 464.
82. C. Zimmer and U. Wahnert, Prog. Biophys. Mol. Biol., 1986, 47, 31.

83. R. Reeves and M. S. Nissen, J. Biol. Chem., 1990, 265, 8573.
84. B. J. Reardon, R. S. Winters, D. Gordon, and E. Winter, Proc. Natl. Acad. Sci. USA, 1993, 90, 11 327.
85. B. H. Geierstanger, B. F. Volkman, W. Kremer, and D. E. Wemmer, Biochemistry, 1994, 33, 5347.
86. A. H.-J. Wang, G. Ughetto, G. J. Quigley, and A. Rich, J. Biomol. Struct. Dyn., 1986, 4, 319.
87. A. H.-J. Wang, G. Ughetto, G. J. Quigley, T. Hakoshima, G. A. van der Marel, J. H. van Boom, and A. Rich, Science, 1984, 225, 1115.
88. G. J. Quigley, G. Ughetto, G. A. van der Marel, J. H. van Boom, A. H.-J. Wang, and A. Rich, Science, 1986, 232, 1255.
89. H. Ohkuma, F. Sakai, Y. Nishiyama, M. Ohbayashi, H. Imanishi, M. Konishi, T. Miyaki, H. Koshiyama, and H. Kawaguchi, J. Antibiot., 1980, 33, 1087.
90. M. Konishi, H. Ohkuma, F. Sakai, T. Tsuno, H. Koshiyama, T. Naito, and H. Kawaguchi, J. Am. Chem. Soc., 1981, 103, 1241.
91. E. Arnold and J. Clardy, J. Am. Chem. Soc., 1981, 103, 1243.
92. C.-H. Huang, S. Mong, and S. T. Crooke, Biochemistry, 1980, 19, 5537.
93. S. Toda, K. Sugawara, Y. Nishiyama, M. Ohbayashi, N. Ohkusa, H. Yamamoto, K. Konishi, and T. Oki, J. Antibiot., 1990, 43, 796.
94. D. L. Boger and J.-H. Chen, J. Am. Chem. Soc., 1993, 115, 11 624.
95. M. Suzuki, Nature, 1990, 344, 562.
96. C. Robledo-Luiggi, W. D. Wilson, E. Pares, M. Vera, C. S. Martinez, and D. Santiago, Biopolymers, 1991, 31, 907.