DNA Repair 6 (2007) 489–504
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/dnarepair
The enigmatic thymine DNA glycosylase
Daniel Cortázar, Christophe Kunz, Yusuke Saito, Roland Steinacher, Primo Schär∗
Centre for Biomedicine, Department of Clinical Biological Research, University of Basel, Basel, Switzerland
Article info
Article history: Published online 20 November 2006
Keywords: BER; TDG; SUMO; Deamination

Abstract
When it was first isolated from extracts of HeLa cells in Josef Jiricny’s laboratory, the thymine DNA glycosylase (TDG) attracted attention because of its ability to remove thymine, i.e. a normal DNA base, from G·T mispairs. This implicated a function of DNA base excision repair in the restoration of G·C base pairs following the deamination of a 5-methylcytosine. TDG turned out to be the founding member of a newly emerging family of mismatch-directed uracil-DNA glycosylases, the MUG proteins, that act on a comparably broad spectrum of base lesions including G·U as the common, most efficiently processed substrate. However, because of its apparent catalytic inefficiency, some have considered TDG a poor DNA repair enzyme without an important biological function. Others have reported 5-meC DNA glycosylase activity to be associated with TDG, thrusting the enzyme into the limelight as a possible DNA demethylase. Yet others have found the glycosylase to interact with transcription factors, implicating a function in gene regulation, which appears to be critically important in developmental processes. This article reviews all these developments in view of possible biological functions of this multifaceted DNA glycosylase. © 2006 Elsevier B.V. All rights reserved.
1. Introduction
Within cells, the chemically unstable DNA is under permanent hydrolytic and chemical attack. Hydrolytic reactions occur at a significant rate and include the deamination of DNA bases with exocyclic amino groups, i.e. cytosine (C) and 5-methylcytosine (5-meC), adenine (A) and guanine (G) [1]. Deamination of C and 5-meC generates uracil (U) and thymine (T) mispaired with guanine, respectively, both giving rise to C·G to T·A transitions unless repaired. While U is a foreign base in DNA and is easily recognized and repaired as such, the correction of a deaminated 5-meC, i.e. a T, requires a higher level of sophistication at damage recognition, since the “damage” in this case is a perfectly normal DNA base, except that it is mispaired. Such thoughts led Josef Jiricny and colleagues to search for a DNA repair function that processes T when mispaired with G to restore a canonical G·C base pair. In transfection experiments with G·T mismatched SV40 DNA they indeed identified a G·T directed repair activity in African green monkey kidney cells that efficiently replaced the T with a C [2]. The subsequent purification of a G·T binding and processing enzyme from nuclear extracts of HeLa cells and the molecular cloning of the respective cDNA eventually led to the discovery of the human thymine DNA glycosylase (TDG) [3–5], the first mismatch-specific DNA glycosylase to be described. Its ability to hydrolyze thymine and uracil from G·T and G·U mispairs in vitro [6] implicated a specific biological role in base excision repair (BER) of deaminated 5-meC and C, i.e. in countering deamination-induced C → T mutation.

During the last decade, research on TDG has seen an impressive expansion into different disciplines. Enzymatic and structural studies provided insight into different aspects of its functionality. The identification and characterization of homologs and orthologs of species across the phylogeny shed light on the evolution of this family of DNA glycosylases and facilitated first genetic approaches towards unraveling biological functions. While so far all efforts have failed to assign the human TDG to a well-defined cellular process, they have established lines of evidence that support three main working hypotheses. The biochemical and structural properties of TDG support a function in DNA BER of damaged or modified pyrimidine bases; biochemical and cell biological evidence has suggested a role in the active removal of 5-meC from methylated CpG dinucleotides in DNA; and protein–protein interactions have implicated TDG in the regulation of gene expression. What seems to be clear from all these studies is that, as a DNA glycosylase, TDG has some rather unusual features and that these may hint towards a link between DNA repair, the control of epigenetic DNA modification and the regulation of gene expression. Whether and how these seemingly divergent aspects of TDG function can be reconciled in a unifying mechanistic model remains to be addressed in the future. The objective of this article is to review the results of the last decade of research on TDG and to evaluate the emerging concepts for a biological function.

∗ Corresponding author. Tel.: +41 61 267 0767; fax: +41 61 367 3566. E-mail address: [email protected] (P. Schär). 1568-7864/$ – see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.dnarep.2006.10.013
2. TDG: protein structure and enzymology
2.1. Primary structure and the evolutionary aspects
The cloning and sequencing of the cDNA encoding the human TDG [5] facilitated the search for related proteins in other organisms. This disclosed a broad phylogeny with orthologs in bacteria (e.g. Escherichia coli) [7], yeasts (e.g. Schizosaccharomyces pombe), insects (e.g. Drosophila melanogaster) [8] and frogs (e.g. Xenopus laevis). All these proteins belong to the MUG branch of the superfamily of monofunctional uracil-DNA glycosylases (UDGs) that share a common α/β-fold structure [9] (Fig. 1). Although the human TDG is the founding member, the family was named after the E. coli Mug protein, a mismatch-specific uracil-DNA glycosylase, to account for the fact that U rather than T processing is a common trait of these proteins. The MUG proteins have a simple domain architecture; they are composed of a conserved core that constitutes the active site, and non-conserved N- and C-terminal extensions of variable lengths (Fig. 2). Within their catalytic domains, all orthologs share between 37% and 52% amino-acid sequence identity but no significant similarities with members of other UDG families, e.g. UNG and SMUG proteins. The molecular masses of the MUGs range between 18 kDa and 46 kDa with one notable exception. Owing to its uniquely long N- and C-terminal sequences, the Drosophila ortholog (Thd1) is a remarkably sized DNA glycosylase of more than 191 kDa. Similarly large proteins with active DNA glycosylase domains have thus far been described only in plants (i.e. DEMETER, ROS1), where they appear to control cytosine methylation mediated gene silencing [10,11]. Recent studies shed some light on possible functions of the divergent N- and C-terminal domains of the eukaryotic MUGs. Variations in the composition and configuration of these termini appear to correlate with changes in substrate specificity, substrate interaction and the kinetics of base release; they may thus be there to modulate the enzymatic activity of TDG in a process-dependent manner [7,8,12,13]. The terminal domains of the human and murine TDGs were also found to mediate diverse physical and functional interactions with other proteins, including nuclear receptors and other transcriptional regulators. This emphasizes a possible species-specific role of these domains in targeting the glycosylase to specific DNA sequences in the genome where its activity is needed [14–20].

Fig. 1 – Evolutionary conservation of MUG proteins. (A) Shown are the clustered relationships between representative members of the UDG superfamily. The MUG family is highlighted in green. Included are Homo sapiens TDG (hsTDG, accession no. Q13569), UNG2 (hsUNG2, P22674) and SMUG1 (hsSMUG1, Q53HV7); Xenopus laevis UNG (xlUNG, AAH72313), TDG (xlTDG, AAH77465.1) and SMUG1 (xlSMUG1, Q9YGN6); Drosophila melanogaster Thd1 (dmThd1, Q9V4D8) and Smug1 (dmSMUG1, Q9VEM1); Schizosaccharomyces pombe Ung1 (spUng1, O74834) and Thp1 (spThp1, O59825); Saccharomyces cerevisiae Ung1 (saccUng1, P12887); Escherichia coli Ung (ecUNG, P12295) and Mug (ecMUG, P0A9H1); Serratia marcescens Mug (smMUG, P43343); Streptomyces coelicolor MUG (scMUG, NP_625542) and UDGb (scUDGb, NP_626251); Archaeoglobus fulgidus UDG (afUDG, NP_071102); Thermotoga maritima UDG (tmUDG, NP_228321); Pyrobaculum aerophilum UDGa (paUDGa, NP_558739) and UDGb (paUDGb, NP_559226); Mycobacterium tuberculosis UDG (mtUDG, NP_335742); and Thermus thermophilus UDG (tthUDG, CAD29337). The tree was generated with the neighbor-joining algorithm of the MEGA 3.1 software applied to a multiple alignment produced with the ClustalW routine (Blosum matrix; pairwise alignments: Gap opening penalty 10, Gap extension penalty 0.1; multiple alignment: Gap opening penalty 10, Gap extension penalty 0.1).
Fig. 2 – Schematic alignment of MUG proteins. Conserved sequences and protein motifs as well as known interactions with other proteins are shown as indicated in the legend. The highly conserved central domain harbors the sequence motifs G(I/L)NPG(L/I) and VMPSSSAR (hsTDG), representing critical residues of the active site. Similarities in the N- and C-terminal parts of the mammalian TDGs are confined to the SUMO-interaction motifs and the SUMOylation consensus motif VKEE. Bars on top of the human TDG designate the minimal sequence requirement for G·U or G·T processing. The predicted AT-hook motifs present in the mammalian and insect N- or C-termini may provide non-specific DNA binding capacity. Bars underneath the respective MUG orthologs indicate identified protein interactions. NR, nuclear receptors: androgen receptor, glucocorticoid receptor, progesterone receptor, peroxisome proliferator-activated receptor α, thyroid hormone receptor α, vitamin D3 receptor.
2.2. Three-dimensional structure and implicated mechanisms

A deeper understanding of structure–function aspects came with the resolution of the three-dimensional structure of the E. coli Mug protein. Laurence Pearl and collaborators described an α/β-fold for Mug that closely resembles that of the functionally related UNG and SMUG proteins. In the light of the very limited amino-acid sequence homology between the members of these UDG families, this was a rather surprising finding. Crystals capturing Mug bound to its DNA substrate then confirmed that, like other UDGs, it utilizes a combined intercalation/nucleotide flipping mechanism for base recognition and binding [21,22]. A closer look into the active site, however, revealed some unique features of Mug. Whereas UNG and SMUG enzymes achieve a high substrate selectivity through strategic configurations of active site residues that establish specific contacts with the uracil [23–25], Mug forms a large hydrophobic catalytic cavity that accommodates a variety of pyrimidine and purine derivatives without contacting the base to be hydrolyzed [22]. Hence, MUG proteins have a comparably broad substrate spectrum that includes lesions as bulky as ethenoadducts of C and A (Table 1) [8,12]. Another unique feature of Mug is its interaction with the complementary DNA strand opposite the damaged base. Residues within the catalytic pocket (Gly143, Leu144 and Arg146) form a ‘wedge’ that intercalates into the DNA base stack from the minor groove and occupies the space of the substrate base. This wedge establishes specific hydrogen-bonding interactions with the widowed G in a configuration that mimics Watson–Crick base pairing. These may account for the strict double-strand dependency of the MUG proteins and, since they are absolutely specific for G, also explain their opposite G preference [21]. Thus, Mug uses the complementary base for substrate discrimination, while other UDGs establish specific contacts with the substrate base. Mutational analyses and a recently resolved crystal structure of the catalytic core of the human TDG are fully consistent with the structure–function model postulated on the basis of the E. coli Mug structure. It therefore appears that the basic catalytic mechanism can be extrapolated from Mug to human TDG and probably to the entire MUG family [26,27]. However, the Mug model fails when it comes to explaining the effects of the N- and C-terminal domains on the substrate spectrum and the kinetic properties of the eukaryotic MUGs.
Table 1 – Substrate preferences of TDG orthologs (a)

Substrates (b), in row order: G·U, A·U, ssU, G·FU, A·FU, ssFU, G·BrU, A·BrU, ssBrU, G·HmU, G·HU, G·T, G·Tg, G·εC, A·εC, ssεC, G·Hx, T·Hx, ssHx, G·εA, T·εA, ssεA, G·mC, G·HeC, G·HpC, G·G

hsTDG (c): +++ + − +++ ++ ++ +++ + − +++ +++ +++ ++ +++ ++ − + − − − − − −/+ − −
NhsTDG (d): +++ + − nd nd nd nd nd nd nd nd − nd nd nd nd + − nd nd nd nd nd nd nd nd
ecMug (d): +++ + − nd nd nd nd nd nd + ++ − nd +++ nd + nd − nd + − nd nd ++ − nd
spThp1p (c): +++ +++ +++ +++ +++ ++ ++ + + − +++ − nd +++ +++ +++ +++ +++ +++ ++ ++ + − +++ +++ +
dmThd1p (c): +++ ++ − +++ ++ ++ +++ − − ++ nd ++ nd +++ ++ − + − − − − − − nd nd

(a) Indicated are relative processing efficiencies of recombinant human full size (hsTDG) and N-terminally truncated TDG (NhsTDG) and the orthologs of E. coli (ecMug), S. pombe (spThp1p) and D. melanogaster (dmThd1p). Base release efficiencies are indicated as: +++, high; ++, intermediate; +, low; −, insignificant.
(b) Putative substrate bases are indicated in bold letters. Abbreviations used are: ss, single-stranded DNA; ds, double-stranded DNA; FU, 5-fluorouracil; BrU, 5-bromouracil; HmU, 5-hydroxymethyluracil; HU, 5-hydroxyuracil; Tg, thymine glycol; εC, 3,N4-ethenocytosine; Hx, hypoxanthine; εA, 1,N6-ethenoadenine; mC, 5-methylcytosine; HC, 5-hydroxycytosine; HeC, 3,N4-α-hydroxyethanocytosine; HpC, 3,N4-α-hydroxypropanocytosine; nd, not done.
(c) Fully AP-site inhibited; no enzymatic turnover.
(d) Partially AP-site inhibited; slow enzymatic turnover.
2.3. Enzymatic properties of TDG
2.3.1. Substrates of TDG
Several laboratories have investigated substrate processing features of MUG proteins. Some of these studies made use of specific chemical modifications at or near the target base (U or T) or its mispaired vis-à-vis (G) to explore mechanisms of substrate interaction and base hydrolysis by TDG. Others examined synthetic candidate base lesions with the aim of identifying biologically relevant substrates [28]. We will focus our account here on a few representative substrates that illustrate the functional versatility and typical mechanistic properties of MUG proteins and implicate main lines of possible biological functions. Although the human TDG is best known for its ability to remove T from a G·T mismatch, MUG proteins of different origin were shown to have rather broad substrate spectra with U mispaired to G being the common, most efficiently processed physiological DNA lesion (Table 1) [7,8,12,29]. Derivatives of U with modifications or substituents at the 5-carbon position such as 5-hydroxy-U, 5-hydroxymethyl-U, 5-fluoro-U (5-FU) and 5-bromo-U (5-BrU) turned out to be very efficiently and universally processed substrates as well [8]. By itself, this would suggest that the driving force for the evolution of the MUG protein family was the potential to counter mutagenesis by deamination and/or oxidation of C. It seems, however, that these glycosylases have learned to do more than that (Table 1). They act on DNA lesions as divergent as ethenoadducts (e.g. 3,N4-ethenocytosine) [8,30,31], deaminated purines (e.g. hypoxanthine) [8] and thymine glycol [32], and even on normal DNA bases including T and 5-meC [33]. The biochemistry of the latter, however, has remained somewhat unclear. While the chicken MUG, normally referred to as 5-meC DNA glycosylase (5-MCDG), was shown to co-purify with an appreciable 5-meC processing activity from extracts of chicken embryos, this activity is comparably low, not to say marginal, when assayed with bacterially expressed chicken 5-MCDG or human TDG [8,33,34]. This may indicate a requirement for auxiliary factors that help target the glycosylase to the methylated C and, thus, facilitate the processing of this rather suboptimal substrate. Interestingly though, the substrate spectra vary considerably between MUGs of different phylogenetic origin. Thymine and derivatives, for instance, are processed at a significant rate by the mammalian, chicken and Drosophila enzymes but not by their bacterial and yeast counterparts [8,33], whereas 5-meC appears to be a substrate for the vertebrate enzymes only (Table 1) [8,33,34]. The degree of DNA double-strand and mismatch dependency is also variable. While the human TDG processes most of its substrate bases only in a mismatch with G, the fission yeast ortholog removes U from DNA also when it arises opposite A or in a single-stranded DNA context. It therefore appears that the MUG proteins have evolved with little selective pressure so that, in accordance with the specific needs of individual species, enzymes or enzyme complexes with rather distinct functionalities could develop.

If we try to infer a biological function for MUG proteins from their substrate preferences in vitro, the following picture emerges. Besides the elimination of mutagenic bases arising by hydrolytic deamination of C (e.g. G·U), MUG proteins seem to protect more generally against DNA base deamination (e.g. hypoxanthine) and/or oxidation (5-hydroxyuracil), as well as against base modifications by products of lipid peroxidation (e.g. 3,N4-ethenocytosine). In organisms that methylate C in their DNA, MUG orthologs have acquired an ability to additionally deal with the corresponding deamination/oxidation products (e.g. G·T, 5-hydroxymethyluracil), and possibly, to contribute to the establishment and the stabilization of genomic DNA methylation patterns (e.g. 5-meC).
2.3.2. Mechanism of substrate interaction
Studies with non-cleavable substrate analogs provided first insight into how human TDG interacts with its substrate. In footprinting experiments, TDG protected an approximately 20 base pair stretch of DNA surrounding the mispaired U from DNaseI cleavage and made specific contacts to the N7 position of the G flanking the U at the 3′ side. The latter may explain a slight preference of TDG for G·T and G·U mispairs in a CpG sequence context [35]. No contacts with the “complementary” G opposite the lesion were detected in these studies although, by inference from the crystal structure of substrate bound Mug [21,22] and the strict requirement of this base for substrate recognition, TDG is expected to establish such an interaction. This would, however, involve the Watson–Crick surface of the widowed G, which cannot be seen easily by methylation interference [35]. The ability to recognize the substrate base through specific interactions with the nucleotide in complementary position seems important for a DNA glycosylase that has the potential to attack normal DNA bases. In the case of TDG, this assures that T is recognized as a substrate only when it is mispaired with G, i.e. originates from a deaminated 5-meC, but not when it is correctly base paired with A. A mechanistic feature of the MUG proteins that, at least in part, relates to their ability to establish specific complementary base contacts, is their tight binding to the product of their reaction, the abasic site (AP-site). Human TDG, for instance, was shown to bind to an AP-site opposite G with an affinity that is higher than that to any of its preferred substrates (i.e. G·U or G·T mismatches) [26,36]. Indeed, purified full-length TDG is virtually unable to dissociate from the AP-site and is therefore fully product inhibited in base excision assays with G mispaired substrate. When the opposite base is not a G, however, slow dissociation of TDG is possible.
Although the processing of such substrates is generally less efficient because of rate limitations at the level of substrate recognition, this corroborates the contribution of specific opposite base interactions to product inhibition.
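The contrast between full and partial product inhibition can be illustrated with a minimal kinetic sketch. It assumes saturating substrate, so that each enzyme molecule is either poised on a substrate (excising the base with rate constant `k_chem`) or held on the resulting AP-site (leaving it with rate constant `k_off`). The function name, the two-state scheme and all numerical values are illustrative assumptions, not measurements from the studies cited above.

```python
import math


def bases_released(t, e0, k_chem, k_off):
    """Amount of base released at time t (same units as e0).

    Hypothetical two-state scheme under saturating substrate:
      enzyme-substrate  --k_chem-->  enzyme-AP-site + free base
      enzyme-AP-site    --k_off--->  enzyme-substrate (rebinding is instant)
    """
    k = k_chem + k_off
    # Steady-state fraction of enzyme that is NOT held on an AP-site.
    free_fraction = k_off / k
    # Initial burst: all enzyme starts on substrate and relaxes to steady state.
    burst = (1.0 - free_fraction) * (1.0 - math.exp(-k * t)) / k
    # Linear phase: slow steady-state turnover limited by AP-site release.
    return e0 * k_chem * (free_fraction * t + burst)


# Fully product inhibited (k_off = 0): release stops at one base per enzyme.
single_turnover = bases_released(1000.0, 1.0, 1.0, 0.0)
# Partially product inhibited (small k_off): slow additional turnover.
slow_turnover = bases_released(100.0, 1.0, 1.0, 0.01)
```

With `k_off = 0` the curve plateaus at one equivalent of base per enzyme, which is the behaviour described for full-length TDG on G-mispaired substrates; any non-zero `k_off` adds a slow linear phase after the initial burst.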
Some degree of AP-site inhibition is common to most DNA glycosylases (e.g. [37–41]), but the extent observed with TDG is truly exceptional, even by comparison with other members of the MUG family. The E. coli Mug protein, for instance, turns over on a G·U substrate, albeit slowly [12], implying that the opposite G interactions cannot fully account for the inability of TDG to dissociate from the AP-site. Indeed, deletion mutagenesis showed that truncation of the N-terminus converts TDG into a Mug-like enzyme that processes G·U with a slow turnover, but fails to act on a G·T substrate with an appreciable efficiency [7,13]. Concurrently, deletion of the N-terminus abolishes a non-specific homoduplex DNA binding activity of TDG, which, compared to Mug and other DNA glycosylases, is quite appreciable [13,18,26]. This could all be explained by the presence of HMGA (HMGI/Y)-box-like sequences in the N-terminal domain of TDG (Fig. 2). HMGA-boxes are frequently associated with nuclear proteins and have been shown to act as auxiliary structural motifs to provide non-specific DNA binding functionality [42,43]. Consistently, the Drosophila Thd1, which also processes G·T mispairs, contains related AT-hook motifs in its N- and C-terminal domains, whereas the fission yeast and the bacterial orthologs that fail on G·T substrates are devoid of such sequence motifs (Fig. 2). These correlations suggest that the non-conserved N-terminus of TDG has evolved to allow non-specific DNA binding, facilitating the processing of energetically suboptimal substrates such as G·T or G·5-meC at the cost of free enzymatic turnover. Unfortunately, despite considerable efforts, full-length TDG has eluded three-dimensional structural analyses so that we do not know exactly how the N-terminal domain cooperates with the catalytic domain in DNA interaction. Analyses of partial tryptic digests of free and DNA bound TDG, however, have shed some light onto the problem [13].
Such experiments indicated that, upon encountering DNA, TDG undergoes a dramatic conformational change that depends on the presence of and involves the N-terminal domain (Primo Schär, Roland Steinacher, unpublished results). Consistently, crystallographic data revealed that E. coli Mug, which lacks a comparable N-terminus, undergoes only a minor structural rearrangement when it binds to DNA [22]. Thus, the experimental evidence available supports a mechanistic model in which the N-terminal domain of TDG forms a flexible “clamp” that holds the glycosylase onto the DNA. In this state, TDG may slide along the DNA in search of a G mismatched substrate. At substrate recognition and target base flipping, residues of the catalytic pocket establish the specific hydrogen-bonding interactions with the widowed G vis-à-vis. Following base release, these G contacts and the non-specific DNA contacts mediated by the N-terminus then cooperate to prevent free dissociation of TDG from the AP-site. Such a model would predict the need for a release factor that stimulates the displacement of TDG so that BER can proceed. A possible candidate would be the BER enzyme acting downstream of TDG, i.e. the AP-endonuclease (APE1). Indeed, experiments with purified human proteins showed that APE1 is able to stimulate the turnover of TDG on a G·T substrate [29]. However, the requirement of a high molar excess of APE1 and the fact that any other AP-site interacting protein tested has a similar stimulatory impact on TDG turnover imply that these are passive rather than active and specific effects (Primo Schär, Ulrike Hardeland, unpublished data), as proposed also for the APE1 mediated stimulation of other DNA glycosylases (e.g. [44]).
2.4. TDG and SUMO
A search for proteins interacting with human TDG led to the isolation of Small Ubiquitin-like Modifiers (SUMOs) [18]. SUMOs are small polypeptides structurally related to ubiquitin that interact with and/or are attached to other proteins. Mechanistically, covalent SUMO modification (SUMOylation) is similar to ubiquitylation but requires its own set of E1, E2 and E3 conjugating enzymes [45]. Most of the targets of SUMOylation appear to be nuclear proteins with diverse functions [46]. Among them are proteins involved in DNA replication and repair as well as mediators of chromosome structure and dynamics, implicating a prominent role of SUMO modification in genome maintenance and stability [47]. The functional consequences of SUMO conjugation are still poorly understood, although an emerging theme is that it induces conformational rearrangements in target proteins in a way that alters their molecular interaction properties. Hence, depending on the target, SUMO-dependent changes in intra- or intermolecular interactions may affect protein localization, stability and enzymatic activity. TDG was found not only to interact with, but also to be modified by SUMO-1 and SUMO-3. SUMO conjugation involves lysine 330 (K330) located in a C-terminal SUMOylation consensus motif (VKEE) (Fig. 2); it is ATP-dependent and, when performed in cell extracts, stimulated by the presence of DNA. SUMO attachment to K330 affects structural and enzymatic properties of TDG. The modified glycosylase is no longer able to interact with free SUMO or SUMO-conjugated proteins (Primo Schär, Roland Steinacher, unpublished data), or to bind stably to AP-sites or any other DNA. Yet, it processes a G·U substrate with enhanced efficiency due to an induced enzymatic turnover but, at the same time, loses its ability to hydrolyze T from a G·T substrate.
Evidently, SUMO modification alters the way TDG interacts with DNA so that AP-site dissociation and, hence, a slow turnover on an energetically favourable substrate (G·U) becomes possible, but processing of a suboptimal substrate that requires tight DNA interactions (G·T) is less efficient [13,18,26]. Thus, SUMO modification in the C-terminus converts TDG to an enzyme with Mug-like properties, as does the deletion of the N-terminus. As it turned out, this is not pure coincidence. A systematic assessment of the enzyme kinetic effects of SUMOylation on different domain truncation variants of TDG revealed that SUMO conjugation to full-length TDG and deletion of the N-terminus affect the same underlying mechanism of AP-site product inhibition. The results suggested that SUMOylation neutralizes the non-specific DNA binding activity of the N-terminal domain of TDG. The DNA-dependent conformational change of the N-terminus, i.e. the DNA clamp formation, was not seen when TDG was SUMOylated [13]. Hence, just as SUMO modification prevented TDG from assuming a DNA binding conformation in these experiments, it may induce the opening of the “clamp” when it is conjugated to DNA bound TDG.
Other work addressed the non-covalent interaction of TDG with SUMO [18]. Mutational analyses identified a C-terminal SUMO-interaction motif (residues 304–316) that is distinct from the site of covalent attachment but appears to be essential for SUMO conjugation in vivo [48]. A putative second SUMO-interaction site is discernible in the N-terminal part of TDG, where the sequence “I134 V135 I136 I137” matches an experimentally deciphered SUMO-interaction consensus [49]. An interaction of SUMO with these or nearby amino-acids might have considerable functional consequences as they are part of the conserved glycosylase active site and also overlap with a motif that mediates interaction with the estrogen receptor α [19]. Whether or not TDG uses these interfaces to interact with free SUMO or with SUMO-modified proteins in the context of its biological function is currently unknown. The interactions of TDG with SUMO were ultimately visualized in a crystal structure. Although these analyses were done with a truncated TDG (residues 112–339), lacking the N-terminus and a large part of the C-terminal domain, the structure showed that upon attachment to K330, the TDG–SUMO conjugate assumes a highly organized configuration that builds on specific intermolecular contacts between the two protein moieties. These contacts engage residues of the C-terminal, non-covalent SUMO-interaction motif of TDG, which forms a β-strand that wraps around a β-strand of the SUMO to form an intermolecular antiparallel β-sheet structure. This conformation is further stabilized by extensive polar and hydrophobic contacts between residues of the intertwined β-strands [27]. Thus, consistent with our finding that SUMOylated TDG loses its ability to interact with non-conjugated SUMO, the structure shows that the C-terminal SUMO-interaction site of TDG is fully occupied when SUMO is attached and therefore no longer free for further non-covalent binding of SUMO.
As a consequence of SUMO conjugation, an α-helical peptide of TDG, containing the SUMO attachment site, forms a protrusion on the surface of the TDG–SUMO complex. From structural modeling, it was concluded that the protruding α-helix would interfere with DNA binding. Consistently, SUMO conjugation to the truncated TDG used in this study slightly reduced AP-site binding [27]. Yet, because these data were obtained with an N-terminal truncation of TDG, this cannot be the mechanism to account for the SUMOylation-induced AP-site dissociation observed with full-length TDG, where the tight interaction with the AP-site results from cooperative DNA binding by the N-terminal domain (unspecific DNA clamp) and the glycosylase active site (specific opposite G contacts) (see Section 2.3.2); a TDG lacking the non-specific DNA binding function dissociates from AP-sites regardless of whether SUMO is attached or not [13]. Hence, while the structure provided valuable insight into the architecture of the TDG–SUMO conjugate, further studies with full-length TDG will be necessary for an ultimate resolution of the mechanism of SUMO-induced AP-site dissociation.

Taken together, biochemical and structural evidence supports the concept that SUMOs interact covalently and non-covalently with TDG and thereby induce changes in protein conformation that are required for its functionality, in particular, for its release from the AP-site. SUMOylation can thus be considered an integral regulatory component of TDG mediated BER. If so, the apparent inconsistency that SUMOylation enhances G·U processing while abolishing G·T processing must be reconciled in a mechanistic model for TDG function in cells. This is possible if we appreciate first, that SUMOylation is a highly dynamic, i.e. reversible, protein modification, and second, that the majority of TDG protein in cells is SUMO-free, i.e. competent to recognize and process the full range of its substrates. Given this, a model can be postulated that invokes SUMOylation as a temporary TDG modification, affecting only DNA bound TDG and allowing the glycosylase to dissociate from the product AP-site following base release, so that BER can proceed. Dissociated TDG will then be readily de-modified by a SUMO-specific isopeptidase [50] to make it available for de novo recognition of G·U and G·T mispairs that might be generated [18].
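The postulated cycle can be written down as a small state machine, which makes the order of events explicit. This is only a sketch of the model as described in the text; the state names, the `NEXT_STATE` table and the helper functions are hypothetical labels introduced for illustration, and the mispairs are written in ASCII as "G-U" and "G-T".

```python
from enum import Enum


class TDGState(Enum):
    FREE = "SUMO-free, not on DNA"
    SUBSTRATE_BOUND = "clamped on DNA at a G-U or G-T mispair"
    AP_BOUND = "base released, held on the AP-site"
    SUMOYLATED_ON_DNA = "SUMO attached at K330, clamp opening"
    SUMOYLATED_FREE = "dissociated from the AP-site, still SUMO-modified"


# One step of the postulated cycle; the last step is de-modification
# by a SUMO-specific isopeptidase.
NEXT_STATE = {
    TDGState.FREE: TDGState.SUBSTRATE_BOUND,
    TDGState.SUBSTRATE_BOUND: TDGState.AP_BOUND,
    TDGState.AP_BOUND: TDGState.SUMOYLATED_ON_DNA,
    TDGState.SUMOYLATED_ON_DNA: TDGState.SUMOYLATED_FREE,
    TDGState.SUMOYLATED_FREE: TDGState.FREE,
}


def one_cycle(start=TDGState.FREE):
    """Return the states visited in one full cycle, ending back at start."""
    path = [start]
    state = NEXT_STATE[start]
    while state is not start:
        path.append(state)
        state = NEXT_STATE[state]
    path.append(start)
    return path


def can_process(sumoylated, mispair):
    """Substrate competence as reported in vitro: SUMO-modified TDG
    still processes G-U but no longer hydrolyzes T from G-T."""
    if mispair == "G-U":
        return True
    if mispair == "G-T":
        return not sumoylated
    raise ValueError("this sketch only covers 'G-U' and 'G-T'")
```

Calling `one_cycle()` lists the five states in order and returns to the SUMO-free form, which is the only form that `can_process` reports as competent for both mispairs. This mirrors why transient, DNA-bound SUMOylation would not compromise G·T repair in the model.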
3. Biological functions of TDG
3.1. DNA repair
Given the presence of MUG proteins in species across a broad phylogeny, one might conclude that their function is of fundamental biological importance and has therefore been conserved during evolution. Initially discovered in HeLa cells as an activity that catalyzes the excision of U and T mispaired with a G, the human TDG was proposed to act against mutation when Cs and 5-meCs deaminate. However, later studies, comparing the enzymatic properties of MUGs of different species, identified a number of well-processed DNA substrates that are not generated by deamination of C or 5-meC. These include a range of oxidized pyrimidines as well as some rather bulky ethenoadducts and damaged purine bases (see Section 2.3.1). Hence, it appears that MUG proteins have a more general function in the repair of DNA base damage than originally postulated. It is interesting though that different MUGs have different substrate spectra, suggesting that their function has not been strictly conserved in evolution (Table 1). With G·U being the common most efficiently processed substrate, the MUGs most likely originated from an ancestral uracil processing activity but then diverged to specialized enzymes to suit specific needs of the respective hosts—hence the evolution of non-conserved N- and C-terminal domains around a conserved catalytic core (see Section 2.3.1, Fig. 2). This is perhaps best exemplified by the G·T substrate. The potential to process this substrate appears to correlate with the degree of cytosine methylation in the genomes of the host organisms; it is highest for the mammalian TDGs where 5% of bases are methylated, poor for the Drosophila Thd1 where less than 1% of bases are methylated, and absent from fission yeast Thp1 where cytosine methylation is undetectable [51–53]. Thus, G·T processing might represent an extra feature of the mammalian enzymes that helps avoid genetic instability at sites of cytosine methylation. 
An assessment of the biological function of TDG must therefore be done from a perspective of individual substrates.
3.1.1. Repair of G·T mismatches
Considering a role of TDG in the restoration of methylated G·C pairs following 5-meC deamination, a second DNA glycosylase
with G·T processing ability present in vertebrate cells must be taken into account. This enzyme, called MBD4/MED1, belongs to the family of methyl-CpG-binding domain (MBD) proteins and consists of an N-terminal MBD domain that is linked to a C-terminal DNA glycosylase domain. Although structurally unrelated, MBD4/MED1 has enzymatic properties very similar to those of TDG; it releases T and U from G·T and G·U mismatches, respectively, and processes a number of other substrates in common with TDG [54,55]. To what degree each of these glycosylases contributes to G·T processing in living cells is uncertain. Disruption of MBD4/MED1 in mouse causes a small increase in C → T mutations at CpG sites, which would be consistent with a defect in the repair of deaminated 5-meC that cannot be fully compensated for by the presence of TDG [56]. On the other hand, we found that inactivation of TDG in mouse embryonic stem cells and fibroblasts reduces G·T processing in cell extracts below detection, indicating that TDG provides the predominant activity against the products of 5-meC deamination in these cells (Christophe Kunz, Yusuke Saito, Primo Schär, manuscript in preparation). In the light of this apparent discrepancy, the only firm conclusion that can be drawn at this point is that the G·T repair capacity in vertebrate cells is provided by at least two distinct DNA glycosylases that may act in a partially redundant manner. In the light of such powerful defense, however, it seems surprising that methylated CpGs are mutation hotspots in the mammalian genome, and that C → T transitions at such sites are often seen in the DNA of human cancer cells [57]. One could argue that, given the rather inefficient processing of G·T substrates by TDG and MBD4/MED1, the number of substrates generated by deamination may exceed the repair capacity of the cell. G·T mispairs escaping repair would then give rise to C → T transition mutations upon DNA replication.
Alternatively, the postreplicative mismatch repair system (MMR) might gain access to such G·T mispairs occasionally and, not being able to discriminate mutant from wild-type sequence in the context of non-replicating DNA, erroneously process the G strand and, hence, fix the mutation. Another hypothesis worth considering is that the G·T glycosylases do not act globally in the genome but are targeted to specific sites. Indeed, TDG was reported to interact with transcription factors and to co-regulate gene expression at specific promoters (see Section 3.2). Similarly, MBD4/MED1 was shown to repress transcription of a reporter gene controlled by hypermethylated p16INK4a and hMLH1 promoters [58]. These findings are consistent with G·T repair by TDG and MBD4/MED1 being restricted to specific areas of the genome and associated with defined physiological processes that involve the activation or inactivation of genes. Consequently, some areas of the genome would be more susceptible to mutagenesis through 5-meC deamination, whereas others would be safeguarded by TDG and/or MBD4/MED1. Finally, while all these explanations seem logical from a mechanistic point of view, there is one more possibility to be considered, one that challenges the generally accepted dogma that deamination of 5-meC is the predominant cause of the decline of CpG dinucleotides in our genomes. There is growing evidence that methylated CpGs are hyper-susceptible not only to deamination but also to various forms of endogenous and environmental genotoxic stress that can
give rise to mutagenic lesions that are not substrates for the G·T glycosylases [59]. Preferential mutagenesis at methylated CpGs through pathways that do not, or not exclusively, involve 5-meC deamination could thus resolve the dilemma of why mutations occur at these sites despite the presence of the G·T repair enzymes TDG and MBD4/MED1.
3.1.2. Repair of mismatched uracil
Cytosines suffer hydrolytic deamination at a three- to fourfold lower rate but, on a genomic scale, still more frequently than 5-meC. In double-stranded DNA, this generates a U mispaired with a G. Unless repaired, the U will pair with A during DNA synthesis and, thus, give rise to a C → T transition. Alternatively, U can also arise in DNA through dUMP incorporation during replication, in which case it is base paired with A and, hence, non-mutagenic. The biochemical evidence would predict TDG to act efficiently on G·U mispairs but hardly on an A·U base pair (Table 1). Considering TDG function in the cellular context, however, we are again confronted with a complex situation of redundancy. In a mammalian cell nucleus, TDG finds itself in good company with at least three additional UDGs; i.e. the very potent uracil-DNA glycosylase (UNG2), the MBD4/MED1 protein, and the “single-strand selective monofunctional uracil-DNA glycosylase” (SMUG1). Certainly, these enzymes have not evolved side by side just to back each other up in U excision; there must be specific functional niches for all of them, which remain to be identified. Experimental evidence showing UNG2 interacting with replication proteins and localizing to replication foci suggests that this UDG is specialized for the rapid removal of dUMPs that happen to be misincorporated during DNA replication [60]. This is consistent with the phenotype of UNG deficient cells. Despite the presence of TDG, MBD4/MED1 and SMUG1, these accumulate significant amounts of dUMP in their DNA [61]. This, in keeping with the poor activity on the A·U substrate, would argue against a significant contribution of TDG to the elimination of U that gets misincorporated opposite A.
A direct replication associated function of TDG is further excluded by the fact that the protein is actively degraded by the proteasome pathway at the G1/S boundary of the cell cycle and then remains undetectable during the entire S-phase (Ulrike Hardeland et al., manuscript submitted). Unlike mammalian TDG, however, Thp1 of S. pombe releases U from A·U base pairs [8] and is not cell cycle regulated. Consistently, a fission yeast ung1 mutant does not accumulate significant amounts of dUMP in its genome, unless Thp1 is genetically inactivated as well (Marc Bentele et al., manuscript submitted). Thus, in fission yeast, unlike in mammalian cells, the replicative uracil-DNA glycosylase Ung1 acts synergistically with the TDG ortholog to eliminate U that gets misincorporated during DNA replication. Considering the repair of mutagenic uracil that arises from cytosine deamination, the fact that inactivation of UNG in mouse did not significantly alter the mutation frequency in the Big Blue assay argues for G·U correction being achieved by redundant activities. UNG2 and SMUG1 are good candidates as the C → T transition frequency at the hprt locus of Ung deficient mouse cells increases synergistically when Smug1 is silenced by siRNA [62]. However, TDG is likely to act on deaminated cytosine as well since it is highly active on a G·U substrate and is expressed in most mammalian cell types. A
role of TDG in G·U processing is also evident from genetic data obtained from the S. pombe model, where the concurrent inactivation of thp1 and ung1 increases the C → T transition rate synergistically (Marc Bentele et al., manuscript submitted). Surely, with four enzymes competing for uracil excision in mammalian cells, the situation is more complex and implies some form of functional separation. Separation could be temporal and/or spatial, as indicated by the cell cycle regulation of TDG and/or the interaction of TDG and MBD4/MED1 with transcription factors. Hence, whereas UNG and SMUG1 may have more global genome repair activity, the G·U processing of TDG (and MBD4) may be confined to certain areas of the genome and/or to specific physiological states of the cells. One example of localized generation of G·U mispairs is the AID (activation-induced cytidine deaminase) catalyzed cytosine deamination that occurs when B-cells of the immune system induce somatic hypermutation (SHM) and class switch recombination (CSR) in immunoglobulin genes [63]. Mice with a defect in Ung have a reduced frequency of AID-induced transversion mutations, implicating a defect in the removal of U, as well as low IgG serum levels, implicating inefficient CSR [64]. Therefore, Ung2 was proposed to act on AID generated G·U mispairs in the process of SHM and CSR [63]. According to this model, AP-sites generated by Ung2 either give rise to point-mutations (SHM) when translesion polymerases synthesize across, or induce recombination when converted to single-strand breaks by the action of an AP-endonuclease. However, neither transversion mutagenesis nor CSR is fully defective in an Ung deficient background, indicating a contribution of other UDGs. In principle, these could be attributed to Smug1, Tdg and/or Mbd4.
Smug1 overexpression was indeed found to partially complement the SHM and CSR defect of msh2−/− ung2−/− cells, but an involvement of this protein in antibody diversification is questionable as it is downregulated following B-cell activation [65]. By contrast, TDG is relatively abundant in B-cells and appears to be upregulated upon B-cell activation in vitro (Christophe Kunz and Primo Schär, unpublished results). It may therefore take part in SHM and CSR. From a mechanistic point of view, one might argue that TDG is the enzyme optimally suited to induce transversion mutations following AID catalyzed cytosine deamination. Unlike Ung2, which is a high turnover enzyme designed for rapid and complete repair of U, TDG is slow because it remains bound to the AP-site until it is actively induced to dissociate (see Sections 2.3 and 2.4). This possibility of delaying the processing of the AP-site seems desirable in a condition where a non-instructive lesion is generated for the purpose of “instructing” a mutation. So, despite the indisputable effect of an Ung defect on SHM and CSR, TDG must be considered a candidate glycosylase for the generation of AID-induced mutation and recombination.
3.1.3. Repair of other forms of base damage
TDG has a broad range of substrates that includes oxidation, alkylation and deamination products of C, 5-meC, T and A (see Section 2.3.1 [8,32]), implicating a rather general function in the repair of DNA base damage. Corroborating genetic evidence, however, is still missing with one notable exception, the repair of 3,N4-ethenocytosine (εC) in E. coli. DNA ethenoadducts including εC arise by reaction of DNA with metabolic products of carcinogens such as vinyl chloride or through membrane lipid peroxidation [66,67]. Levels up to 28 adducts per 10^7 bases have been detected in the DNA of various mammalian tissues [68,69]. εC has a mutagenic potential and produces most frequently C → A transversions and C → T transitions, and it was shown to be a reasonably good substrate for MUG proteins, including TDG [8,70]. In E. coli, Mug appears to be the only enzyme capable of excising εC from the DNA. Consistently, Mug deficient strains display hypersensitivity to εC [71], a phenotype that can be complemented by human TDG [72]. In mammalian cells, SMUG1 and MBD4 also process εC [73,74], but compared to TDG, their activity seems relatively weak. Hence, TDG might constitute the main repair activity protecting cells from εC-induced mutagenesis.
3.2. Regulation of gene expression
Already in 1992, Chevray and Nathans reported a physical interaction of mouse TDG with the transcription factor c-Jun [14]. This was the first of a number of reports by several laboratories of physical and functional interactions of TDG with various transcription factors that, altogether, argued for a role of the glycosylase in the regulation of gene expression. Pierre Chambon’s laboratory found TDG to interact with the nuclear receptors RAR (retinoic acid receptor) and RXR (retinoid X receptor) [15]. RAR and RXR form dimeric complexes that bind retinoic acid response elements (RAREs) to regulate gene activity in a ligand-dependent manner [75]. TDG interacts with RAR and RXR through its central catalytic domain (residues 122–346) in a ligand-independent manner (Fig. 2). This enhances the binding of the receptor complexes to RARE containing DNA substrate in vitro and potentiates transactivation of reporter genes in co-transfection experiments [15]. An active site mutant of TDG (N140A), however, failed to stimulate RAR/RXR-mediated transcription significantly above background, suggesting an involvement of the glycosylase function (Ulrike Hardeland, Primo Schär, unpublished data). Estrogen receptor α (ERα), mediating estrogen responses, is another member of the nuclear receptor family that was shown to associate physically and functionally with TDG [19]. The interaction is established through the ligand-binding domain of the receptor (LBD/AF2) and involves residues 116–146 of human TDG (Fig. 2). This sequence contains a putative α-helical motif related to the LXXLL signature that is known to mediate interactions with nuclear receptors. Chromatin immunoprecipitation (ChIP) confirmed that TDG is indeed recruited to estrogen-responsive promoters in MCF7 cells exposed to estradiol (E2), presumably through its interaction with ERα.
Notably, in this case, the glycosylase function of TDG is dispensable for transcriptional co-activation; in transient co-transfection experiments the catalytic mutant (N140A) co-activated an ERα responsive reporter gene as efficiently as the wild-type protein. This stands in contrast to RAR/RXR-dependent transcription, where TDG seems to have an active role. One possible explanation for this apparent discrepancy is that TDG has both DNA glycosylase and scaffold functions in gene regulatory protein complexes, which may be differentially required in different biological contexts, e.g. at methylated versus non-methylated gene promoters (see Section 3.3).
Moreover, TDG was found to associate with SRC1, a p160 co-activator of ERα [20]. Here, the interaction involves regions containing tyrosine-repeat motifs located between residues 334–346 of human TDG and residues 989–1240 of human SRC1 (Fig. 2). ChIP experiments indicated that TDG and SRC1 are both recruited to estrogen-responsive gene promoters in E2 stimulated cells, presumably by ERα. The TDG-SRC1 complex appears to activate ERα-mediated gene expression in the absence of estrogen, suggesting cooperative functional interactions between the two co-activators. Tini et al. demonstrated an interaction of TDG with yet another type of transcriptional co-activator, the CREB binding protein (CBP) and its paralog p300 [17]. Both play an important role in RNA polymerase II-mediated gene transcription, have histone acetyltransferase (HAT) activity and interact with various other transcription factors. Through acetylation of histone tails, CBP/p300 is thought to induce changes in chromatin structure that make promoter regions accessible for transcription factor binding [76]. Interactions with TDG occur through the HAT and CH3 domains of CBP/p300 and involve both the N- and the C-terminal domains of TDG (Fig. 2). The resulting CBP–TDG complex binds DNA, processes G·T and G·U mismatches, is competent for histone acetylation in vitro, and enhances CBP-activated transcription of a reporter gene in transient co-transfection experiments. All this implicates a functional interaction. As in the case of the TDG–ERα complex, however, the glycosylase activity is dispensable for the stimulation of CBP-activated transcription. The interaction with CBP/p300 also leads to the acetylation of lysine residues in the hydrophobic N-terminal region of TDG.
The function of this modification is not entirely clear; it does not seem to affect the enzymatic activity of the glycosylase (Ulrike Hardeland, Primo Schär, unpublished data) but reduces the stability of a ternary TDG–CBP–DNA complex and prevents a DNA mediated interaction with APE1. It was therefore proposed that TDG acetylation may have a regulatory role in the context of chromatin remodeling, gene regulation and DNA repair. In one case, TDG was reported to act as a repressor of transcription. This is when it interacts with the thyroid transcription factor 1 (TTF1), a member of the Nkx2 family of homeodomain proteins that is essential for the embryonal differentiation of the thyroid, lung and brain, as well as for thyroid and lung-specific gene expression in adult tissue [16]. Rat TDG was found to repress TTF1-activated transcription in thyroid and non-thyroid cells in transient co-transfection experiments. Whether or not this repressor function requires the glycosylase activity has not been resolved. In addition to the interactions described above, TDG was seen to associate with several other transcription factors of the nuclear receptor family, including androgen receptor (AR), glucocorticoid receptor (GR), progesterone receptor (PR), peroxisome proliferator-activated receptor α (PPARα), thyroid hormone receptor α (TRα) and Vitamin D3 receptor (VDR) [19]. Although the biological significance of these interactions is unclear, these findings suggest that a cooperation of TDG with transcriptional activators is the rule rather than the exception. Given all these interactions with transcription factors, the question arises whether and how the DNA glycosylase activity of TDG can be reconciled with a role in gene regulation in a plausible functional concept. This is possible if we consider that, in vertebrate genomes, cytosine methylation in CpG dinucleotides is an important feature of gene regulation that can be corrupted by base deamination. TDG might be responsible for initiating correction of G·T mismatches that arise at methylated CpGs or, equally, of G·U mismatches at non-methylated CpGs. Thus, the specific recruitment to gene regulatory elements through the interaction with transcription factors would allow TDG to interrogate the integrity of such sequences, including the CpG islands found in promoters of many vertebrate genes. Transcription factors, on the other hand, could play a role in region-specific DNA repair by ‘sensing’ DNA damage in actively expressed areas of the genome through their ability to recruit DNA repair enzymes like TDG. Methylation of CpG dinucleotides at gene promoters is often associated with gene silencing and is a key epigenetic regulator of gene expression. It may be that, in some circumstances, the reversal of such methylation is necessary for the re-activation of genes. The mechanism of such demethylation is unclear, but it has been suggested that the process could involve BER, starting with the excision of 5-meC by a 5-methylcytosine DNA glycosylase (5-MCDG) that may turn out to be TDG (see Section 3.3).
3.3. TDG and CpG (de)methylation
The possible association of TDG with the regulation of DNA cytosine methylation has sparked a lively (scientific) debate. Because this is an important issue, this section is dedicated to a critical assessment of the relevant experimental evidence. DNA methylation in vertebrates occurs at the 5 position of Cs immediately followed by G, affecting about 60–90% of all CpG dinucleotides depending on the species and/or the tissue examined. Being an epigenetic modification, cytosine methylation is essential for genomic imprinting and X-chromosome inactivation, but also affects gene expression and genomic stability. CpG methylation in vertebrate genomes is largely laid down during embryogenesis. There, dynamic changes in DNA methylation and histone modifications contribute critically to the establishment of cell-type- and tissue-specific gene expression patterns, which are a prerequisite for a coordinated development of the fetus [77,78]. Hence, a tight control of DNA methylation and demethylation processes during embryogenesis is of vital importance. Aberrant DNA methylation in adult tissue has been associated with aging and various human diseases including imprinting disorders and cancer [79]. While the enzymology of cytosine methylation by DNA methyltransferases is reasonably well established [80], mechanisms of active demethylation are largely obscure. Such a function might be required to enforce fidelity on the methylation process or to reactivate silenced genes. Human MBD2b was associated with an enzymatic activity that has the power to remove the methyl-group from 5-meC [81], but the reproducibility of this finding has since been questioned. Jean-Pierre Jost’s laboratory, on the other hand, found a quite different activity in nuclear extracts from developing chicken embryos and from differentiating mouse G8 myoblasts. This activity promoted “demethylation” of 5-meC in a hemimethylated DNA substrate by a process implicating excision repair [82,83].
The purification of this activity from extracts of chicken embryos led to the isolation of a 5-meC DNA glycosylase (5-MCDG) [84], which also processed T in G·T mismatches and eventually turned out to be the chicken ortholog of TDG [33]. However, when produced as a recombinant protein in E. coli, the 5-MCDG/TDG processed 5-meC with an extremely poor efficiency [33], suggesting that the glycosylase on its own cannot constitute a physiologically relevant demethylation activity. Other studies by the same group showed that the 5-meC and G·T mismatch specific DNA glycosylase activities isolated from chicken embryos were sensitive to RNAse digestion, suggesting an involvement of an RNA component [85,86]. Indeed, heterogeneous and CpG rich RNA could be recovered from purified chicken 5-MCDG. Strikingly, when made complementary to the methylated strand of a hemimethylated DNA substrate, synthetic RNAs were able to restore 5-meC DNA glycosylase activity to a previously RNAse treated preparation [86]. It therefore appears that the RNA is an integral part of the glycosylase that may facilitate the targeting of the enzyme to sites where demethylation is needed. Still, some discrepancies in the published experimental evidence remain to be resolved before a firm conclusion about the function of this RNA in 5-meC demethylation can be drawn. Besides the RNA, the purified native 5-meC DNA glycosylase activity also contained a DEAD box protein related to the mammalian p68 RNA helicase [87]. Exactly how this RNA helicase contributes to demethylation is unclear. Yet, according to a model [87], the p68 RNA helicase might be responsible for the rearrangement of the secondary structure of the CpG-rich RNA components to make them suitable for targeting the DNA glycosylase to the site that is to be kept free of methylation.
To assess the relevance of 5-MCDG/TDG for active demethylation in living cells, the Jost group expressed human TDG in human embryonic kidney cells and examined the inducibility of a stably integrated reporter gene controlled by an ecdysone-retinoic acid responsive enhancer-promoter element. Overexpression of 5-MCDG/TDG resulted in the specific demethylation of CpG sites downstream of the hormone response elements in the promoter–enhancer region [34]. Since no genome-wide demethylation was observed, this would fit with the idea that transcription factors target the glycosylase to sites that need to be demethylated or protected from de novo methylation. Another study, however, addressing 5-meCpG demethylation during mouse myoblast differentiation led the same authors to conclude that 5-MCDG/TDG contributes to global demethylation [88]. The work by the Jost laboratory showed that an excision repair process is associated with the removal of 5-meC from DNA. A concept whereby a DNA glycosylase initiates this process seems plausible, and the enzyme in question could indeed be 5-MCDG/TDG. Yet, can DNA excision repair indeed be regarded as an appropriate strategy for global demethylation as it is seen during embryonic development? The answer is: “We don’t know”. What is certain, though, is that if global demethylation occurred by BER, a high degree of coordination would have to be involved to ensure that mutation, rearrangement and fragmentation of the genome through excessive repair are avoided. By its ability to bind and protect AP-sites until it gets SUMOylated, TDG would provide for such coordination at a critical step of the excision repair process. However, considering its poor affinity for 5-meC·G base pairs, its catalytic inefficiency on this substrate and its slow turnover, TDG mediated BER seems a highly unproductive and, thus, risky way of global demethylation, and is therefore unlikely to be a realistic scenario. More likely is that TDG contributes to site-specific stability of CpG methylation. Targeted by auxiliary proteins, i.e. transcription factors, to such sites, it might demethylate 5-meCpGs in gene promoters upon gene activation and/or protect unmethylated CpGs in promoters of active genes from accidental de novo methylation. This would be consistent with the co-purification of 5-meC glycosylase activity in 5-MCDG/TDG containing fractions of nuclear extract of developing embryos [33,89]. Whether or not 5-MCDG/TDG is a genuine 5-meC DNA glycosylase is still uncertain. The activity measured in partially purified preparations of 5-MCDG/TDG may result from the concerted action of the glycosylase with auxiliary factors [86,87]. The possible functions of such factors are equally unclear. They may help target the glycosylase to the methylated cytosine; they may facilitate the disruption of the hydrogen bonds of the 5-meC·G base pair so that the methylated C can be accommodated in the active site pocket of the glycosylase; or they may promote enzymatic deamination of 5-meC to T [90], which would then be a better substrate for TDG. All this would be consistent with purified recombinant 5-MCDG/TDG being very inefficient, if not impotent, in processing of 5-meC [8,33]. Thus, given the right conditions and physiological environment, TDG may act as a 5-MCDG and contribute to demethylation of 5-meC at specific sites in the genome, but this remains to be confirmed.
3.4. TDG in embryonic development
TDG cooperates with transcription factors that are essential for developmental processes (e.g. [91–93]). Embryonic development, on the other hand, is known to be associated with dynamic changes in CpG methylation [78], which, at least in certain areas of the genome, correlates with gene activation or inactivation [94]. Does this implicate TDG in developmental gene regulation? Tdg is readily detectable and highly active in mouse embryonic stem (ES) cells, and the levels even increase when these cells are induced to differentiate in vitro (Yusuke Saito, Primo Schär, in preparation). In the mouse embryo itself, Tdg-specific mRNA is seen to distribute ubiquitously and uniformly across the fetus from days 7.5 to 13.5 post-coitum. Later (at day 14.5), the mRNA is enriched in certain tissues including the developing nervous system, thymus, lung, liver, kidney and the intestine [95]. Whatever the role of TDG in embryogenesis may be, it is essential for proper development of the fetus, as homozygous Tdg null-embryos lose viability at midgestation (Primo Schär et al.; Tetsuya Ono et al., manuscripts in preparation). Given that Ogg1, Nth1, Mpg, Ung, and Mbd4 are all dispensable for embryogenesis [56,61,96–98], this is a rather unusual phenotype for a DNA glycosylase defect. It must therefore be concluded that TDG has a non-redundant essential developmental function that may relate mechanistically to BER but is distinct from the simple elimination of damaged DNA bases. It is tempting to speculate that the TDG defect affects gene expression controlled by its interaction partners RAR/RXR, CBP/p300, c-Jun, and others that have an essential role in embryonic development. The resulting imbalance in gene
expression may then disrupt the developmental program and cause the embryo to die. This function in co-regulation of gene transcription may relate to Tdg’s ability to process 5-meC [33], as an imbalance in CpG methyltransferase activities in embryos [99] also causes dysregulation of gene expression and lethality [100–102]. It may be that Tdg, in conjunction with specific targeting factors, contributes to the establishment and/or the maintenance of proper CpG methylation patterns in certain regions of the genome and thereby assures accurate control of gene expression. At this point, however, this is little more than an interesting hypothesis that is worth testing.
3.5. TDG and cancer
3.5.1. TDG and carcinogenesis
All considered, TDG could contribute to tumor suppression in a number of different ways. It may help maintain genomic stability through the repair of mutagenic DNA base damage (e.g. deaminated C or 5-meC); it may provide epigenetic stability through the excision of erroneously methylated Cs in gene regulatory sequences; and/or it may assure proper cell differentiation and, thus, control the number of stem cells and/or tumor progenitor cells in certain tissues by its ability to cooperate with nuclear receptors and other transcription factors that integrate differentiation signals. CpG dinucleotides, most of which are methylated in vertebrate genomes, are indeed hotspots for mutations, and correlations between CpG mutagenesis and cancer development have long been established. Approximately 25% of all cancer associated mutations in the p53 tumor suppressor gene are C → T transitions located at CpG sites; in colon and gastric cancer, this proportion rises to about 50% [57]. Although human TDG is expressed in most, if not all, tumor relevant tissues, its contribution to the avoidance of such mutations is speculative. The gene was mapped to a chromosomal region (12q24.1) that is frequently affected by loss of heterozygosity in gastric cancers, but inactivating mutations in TDG have not yet been identified in such tissue [103,104]. However, the number of tumors or cancer cell lines screened so far is too small to allow firm conclusions regarding the role of TDG defects in carcinogenesis. Considering its role in controlling gene expression, one might also expect dominant, i.e. oncogenic, effects of TDG defects. In this regard, it is interesting that Tdg expression levels were found increased in mammary gland tumors that developed in HA-ras or c-myc transgenic mice, or in osteosarcomas that arose in p53 heterozygous mice [95]. Be it as a tumor suppressor or as an oncogene, an involvement of TDG in carcinogenesis remains to be established.
3.5.2. TDG and cancer therapy
5-FU is an antimetabolite used in chemotherapy against a wide range of human cancers. Within cells, 5-FU is converted to three main active metabolites: fluorodeoxyuridine monophosphate (FdUMP), fluorodeoxyuridine triphosphate (FdUTP), and fluorouridine triphosphate (FUTP). These interfere with DNA and RNA metabolism. FdUMP inhibits thymidylate synthase (TS), a consequence of which is that dUTP levels increase at the expense of dTTP. This imbalance in the nucleotide pool
gives rise to increased misincorporation of dUMP into the DNA (U·A base pairs). This, together with the direct incorporation of FdUTP (5-FU·A base pairs), is thought to account for the DNA directed cytotoxicity of 5-FU, although the underlying mechanism has remained obscure [105]. 5-FU is an excellent substrate for the MUGs, irrespective of whether it is paired with G or A (Table 1) [8]. These, however, do not seem to provide 5-FU resistance to cells as one might expect; rather, they kill: a fission yeast thp1− mutant is significantly hyperresistant to 5-FU treatment (Marc Bentele et al., manuscript submitted). Similarly, inactivation of TDG in mouse embryonic fibroblasts causes hyperresistance to moderate doses of 5-FU. This hyperresistance correlates with a decrease in 5-FU-induced DNA single- and double-strand breaks and the loss of activation of an intra S-phase DNA damage checkpoint (Christophe Kunz, Primo Schär, manuscript in preparation). Thus, TDG contributes significantly to the DNA directed cytotoxicity of 5-FU and, considering that A·U is a very poor substrate for TDG, this is best explained by its ability to excise 5-FU opposite A. This would generate AP-site intermediates, which, through further processing, would give rise to the DNA strand-breaks that become visible upon 5-FU treatment of cells. Although this is a straightforward explanation for the TDG mediated cytotoxicity of 5-FU, other scenarios, such as an effect of its transcription-associated function, cannot be excluded. Whatever the mechanism, these findings suggest a non-redundant function of TDG in mediating cytotoxicity towards 5-FU. It will therefore be important to examine if the activity of TDG in human tumors correlates with their response to 5-FU-based chemotherapy. The G·T processing function of TDG may also bear chemotherapeutic relevance. T is a substrate for human TDG also when it is mispaired with an O6-methylated guanine (O6-meG·T) [106,107].
O6-meG is a prominent mutagenic and cytotoxic DNA lesion that arises either spontaneously or when cells are exposed to Sn1-type methylating agents, such as Nmethyl-N -nitro-N-nitrosoguanidine (MNNG) that are widely used in cancer chemotherapy. During DNA replication O6-meG pairs with C or T, thus giving rise to O6-meG·T (mis)matches. If generated in excess, the processing of such mispairs by the postreplicative mismatch repair system (MMR) was shown to generate DNA strand-breaks, chromosomal instability and eventually trigger cell death [108]. Given the ability of TDG to act on the same substrate, it has the potential to compete with MMR and, thus, to affect the cytotoxicity of O6menthylguanine-inducing drugs; an interesting hypothesis that merits investigation.
4.
Concluding remarks
This has become a rather extended review, mainly because of the uncertainty about the biological function of TDG (Fig. 3). There are currently quite a few possibilities suggested by experimental evidence that merit careful consideration. Judged from its structure and biochemical properties, TDG is a DNA glycosylase involved in the repair of damaged DNA bases; judged from its interactions with other proteins, it is a co-regulator of gene expression; and judged from the phenotypes of fission yeast and mouse mutants, it may indeed be doing both, repairing DNA base damage and regulating gene expression. At least one of these activities appears to be required for normal mouse embryonic development, setting TDG functionally apart from all other DNA glycosylases that have been genetically studied in mouse. Future research will have to address whether TDG is indeed a multifunctional protein that repairs DNA base damage at one place and controls gene activity at another, or whether the regulation of gene expression involves an as yet unknown function of TDG-mediated BER.
Fig. 3 – Biological processes with a possible involvement of TDG. The diagram illustrates implicated biological functions of TDG along with the relevant experimental observations. Biochemical properties suggest a role of TDG in the repair of DNA base damage (substrates are listed in descending order of cleavage efficiency). Interactions with transcription factors and the lethality of Tdg-deficient mouse embryos suggest a function of TDG in gene regulation, cell differentiation and development. The putative 5-meC glycosylase activity of TDG may indicate that the gene regulatory function involves changes in DNA methylation and, thus, chromatin structure. TDG might also contribute to somatic hypermutation (SHM) and/or class switch recombination (CSR); it processes G·U with high efficiency and protects the AP-site until it is actively induced to dissociate by SUMOylation. <=>, interaction; *, A:U is a substrate only for S. pombe Thp1p.
Acknowledgements
First of all, thank you Joe for a very interesting and still puzzling discovery. Many thanks also to Drs Tetsuya Ono and Adrian Bird for providing unpublished information on the mouse knockout phenotype of TDG. The generous support of the Swiss National Science Foundation, the Association for International Cancer Research (AICR) and the Krebsliga Beider Basel is gratefully acknowledged.
References
[1] T. Lindahl, Instability and decay of the primary structure of DNA, Nature 362 (1993) 709–715.
[2] T.C. Brown, J. Jiricny, Different base/base mispairs are corrected with different efficiencies and specificities in monkey kidney cells, Cell 54 (1988) 705–711.
[3] K. Wiebauer, J. Jiricny, In vitro correction of G/T mispairs to G/C pairs in nuclear extracts from human cells, Nature 339 (1989) 234–236.
[4] P. Neddermann, J. Jiricny, The purification of a mismatch-specific thymine-DNA glycosylase from HeLa cells, J. Biol. Chem. 268 (1993) 21218–21224.
[5] P. Neddermann, P. Gallinari, T. Lettieri, D. Schmid, O. Truong, J.J. Hsuan, K. Wiebauer, J. Jiricny, Cloning and expression of human G/T mismatch-specific thymine-DNA glycosylase, J. Biol. Chem. 271 (1996) 12767–12774.
[6] P. Neddermann, J. Jiricny, Efficient removal of uracil from G.U mispairs by the mismatch-specific thymine DNA glycosylase from HeLa cells, Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 1642–1646.
[7] P. Gallinari, J. Jiricny, A new class of uracil-DNA glycosylases related to human thymine-DNA glycosylase, Nature 383 (1996) 735–738.
[8] U. Hardeland, M. Bentele, J. Jiricny, P. Schär, The versatile thymine DNA-glycosylase: a comparative characterization of the human, Drosophila and fission yeast orthologs, Nucl. Acids Res. 31 (2003) 2261–2271.
[9] L. Aravind, E.V. Koonin, The alpha/beta fold uracil DNA glycosylases: a common origin with diverse fates, Genome Biol. 1 (2000) 0007.
[10] Z. Gong, T. Morales-Ruiz, R.R. Ariza, T. Roldan-Arjona, L. David, J.K. Zhu, ROS1, a repressor of transcriptional gene silencing in Arabidopsis, encodes a DNA glycosylase/lyase, Cell 111 (2002) 803–814.
[11] Y. Choi, M. Gehring, L. Johnson, M. Hannon, J.J. Harada, R.B. Goldberg, S.E. Jacobsen, R.L. Fischer, DEMETER, a DNA glycosylase domain protein, is required for endosperm gene imprinting and seed viability in arabidopsis, Cell 110 (2002) 33–42.
[12] R.J. O’Neill, O.V. Vorob’eva, H. Shahbakhti, E. Zmuda, A.S. Bhagwat, G.S. Baldwin, Mismatch uracil glycosylase from Escherichia coli: a general mismatch or a specific DNA glycosylase? J. Biol. Chem. 278 (2003) 20526–20532.
[13] R. Steinacher, P. Schär, Functionality of human thymine DNA glycosylase requires SUMO-regulated changes in protein conformation, Curr. Biol. 15 (2005) 616–623.
[14] P.M. Chevray, D. Nathans, Protein interaction cloning in yeast: identification of mammalian proteins that react with the leucine zipper of Jun, Proc. Natl. Acad. Sci. U.S.A. 89 (1992) 5789–5793.
[15] S. Um, M. Harbers, A. Benecke, B. Pierrat, R. Losson, P. Chambon, Retinoic acid receptors interact physically and functionally with the T:G mismatch-specific thymine-DNA glycosylase, J. Biol. Chem. 273 (1998) 20728–20736.
[16] C. Missero, M.T. Pirro, S. Simeone, M. Pischetola, R. Di Lauro, The DNA glycosylase T:G mismatch-specific thymine DNA glycosylase represses thyroid transcription factor-1-activated transcription, J. Biol. Chem. 276 (2001) 33569–33575.
[17] M. Tini, A. Benecke, S.J. Um, J. Torchia, R.M. Evans, P. Chambon, Association of CBP/p300 acetylase and thymine DNA glycosylase links DNA repair and transcription, Mol. Cell 9 (2002) 265–277.
[18] U. Hardeland, R. Steinacher, J. Jiricny, P. Schär, Modification of the human thymine-DNA glycosylase by ubiquitin-like proteins facilitates enzymatic turnover, EMBO J. 21 (2002) 1456–1464.
[19] D. Chen, M.J. Lucey, F. Phoenix, J. Lopez-Garcia, S.M. Hart, R. Losson, L. Buluwela, R.C. Coombes, P. Chambon, P. Schär, S. Ali, T:G mismatch-specific thymine-DNA glycosylase potentiates transcription of estrogen-regulated genes through direct interaction with estrogen receptor alpha, J. Biol. Chem. (2003).
[20] M.J. Lucey, D. Chen, J. Lopez-Garcia, S.M. Hart, F. Phoenix, R. Al-Jehani, J.P. Alao, R. White, K.B. Kindle, R. Losson, P. Chambon, M.G. Parker, P. Schär, D.M. Heery, L. Buluwela, S. Ali, T:G mismatch-specific thymine-DNA glycosylase (TDG) as a coregulator of transcription interacts with SRC1 family members through a novel tyrosine repeat motif, Nucl. Acids Res. 33 (2005) 6393–6404.
[21] T.E. Barrett, R. Savva, G. Panayotou, T. Barlow, T. Brown, J. Jiricny, L.H. Pearl, Crystal structure of a G:T/U mismatch-specific DNA glycosylase: mismatch recognition by complementary-strand interactions, Cell 92 (1998) 117–129.
[22] T.E. Barrett, O.D. Scharer, R. Savva, T. Brown, J. Jiricny, G.L. Verdine, L.H. Pearl, Crystal structure of a thwarted mismatch glycosylase DNA repair complex, EMBO J. 18 (1999) 6599–6609.
[23] C.D. Mol, A.S. Arvai, R.J. Sanderson, G. Slupphaug, B. Kavli, H.E. Krokan, D.W. Mosbaugh, J.A. Tainer, Crystal structure of human uracil-DNA glycosylase in complex with a protein inhibitor: protein mimicry of DNA, Cell 82 (1995) 701–708.
[24] L.H. Pearl, Structure and function in the uracil-DNA glycosylase superfamily, Mutat. Res. 460 (2000) 165–181.
[25] J.E. Wibley, T.R. Waters, K. Haushalter, G.L. Verdine, L.H. Pearl, Structure and specificity of the vertebrate anti-mutator uracil-DNA glycosylase SMUG1, Mol. Cell 11 (2003) 1647–1659.
[26] U. Hardeland, M. Bentele, J. Jiricny, P. Schär, Separating substrate recognition from base hydrolysis in human thymine DNA glycosylase by mutational analysis, J. Biol. Chem. 275 (2000) 33449–33456.
[27] D. Baba, N. Maita, J.G. Jee, Y. Uchimura, H. Saitoh, K. Sugasawa, F. Hanaoka, H. Tochio, H. Hiroaki, M. Shirakawa, Crystal structure of thymine DNA glycosylase conjugated to SUMO-1, Nature 435 (2005) 979–982.
[28] U. Hardeland, M. Bentele, T. Lettieri, R. Steinacher, J. Jiricny, P. Schär, Thymine DNA glycosylase, Prog. Nucl. Acid Res. Mol. Biol. 68 (2001) 235–253.
[29] T.R. Waters, P.F. Swann, Kinetics of the action of thymine DNA glycosylase, J. Biol. Chem. 273 (1998) 20007–20014.
[30] M. Saparbaev, S. Langouet, C.V. Privezentzev, F.P. Guengerich, H. Cai, R.H. Elder, J. Laval, 1,N(2)-ethenoguanine, a mutagenic DNA adduct, is a primary substrate of Escherichia coli mismatch-specific uracil-DNA glycosylase and human alkylpurine-DNA-N-glycosylase, J. Biol. Chem. 277 (2002) 26987–26993.
[31] E. Borys-Brzywczy, K.D. Arczewska, M. Saparbaev, U. Hardeland, P. Schär, J.T. Kusmierek, Mismatch dependent uracil/thymine-DNA glycosylases excise exocyclic hydroxyethano and hydroxypropano cytosine adducts, Acta Biochim. Pol. 52 (2005) 149–165.
[32] J.H. Yoon, S. Iwai, T.R. O’Connor, G.P. Pfeifer, Human thymine DNA glycosylase (TDG) and methyl-CpG-binding protein 4 (MBD4) excise thymine glycol (Tg) from a Tg:G mispair, Nucl. Acids Res. 31 (2003) 5399–5404.
[33] B. Zhu, Y. Zheng, D. Hess, H. Angliker, S. Schwarz, M. Siegmann, S. Thiry, J.P. Jost, 5-Methylcytosine-DNA glycosylase activity is present in a cloned G/T mismatch DNA glycosylase associated with the chicken embryo DNA demethylation complex, Proc. Natl. Acad. Sci. U.S.A. 97 (2000) 5135–5139.
[34] B. Zhu, D. Benjamin, Y. Zheng, H. Angliker, S. Thiry, M. Siegmann, J.P. Jost, Overexpression of 5-methylcytosine DNA glycosylase in human embryonic kidney cells EcR293 demethylates the promoter of a hormone-regulated reporter gene, Proc. Natl. Acad. Sci. U.S.A. 98 (2001) 5031–5036.
[35] O.D. Scharer, T. Kawate, P. Gallinari, J. Jiricny, G.L. Verdine, Investigation of the mechanisms of DNA binding of the human G/T glycosylase using designed inhibitors, Proc. Natl. Acad. Sci. U.S.A. 94 (1997) 4878–4883.
[36] T.R. Waters, P. Gallinari, J. Jiricny, P.F. Swann, Human thymine DNA glycosylase binds to apurinic sites in DNA but is displaced by human apurinic endonuclease 1, J. Biol. Chem. 274 (1999) 67–74.
[37] J.W. Hill, T.K. Hazra, T. Izumi, S. Mitra, Stimulation of human 8-oxoguanine-DNA glycosylase by AP-endonuclease: potential coordination of the initial steps in base excision repair, Nucl. Acids Res. 29 (2001) 430–438.
[38] H. Nilsen, K.A. Haushalter, P. Robins, D.E. Barnes, G.L. Verdine, T. Lindahl, Excision of deaminated cytosine from the vertebrate genome: role of the SMUG1 uracil-DNA glycosylase, EMBO J. 20 (2001) 4278–4286.
[39] F. Miao, M. Bouziane, T.R. O’Connor, Interaction of the recombinant human methylpurine-DNA glycosylase (MPG protein) with oligodeoxyribonucleotides containing either hypoxanthine or abasic sites, Nucl. Acids Res. 26 (1998) 4034–4041.
[40] F. Petronzelli, A. Riccio, G.D. Markham, S.H. Seeholzer, J. Stoerker, M. Genuardi, A.T. Yeung, Y. Matsumoto, A. Bellacosa, Biphasic kinetics of the human DNA repair protein MED1 (MBD4), a mismatch-specific DNA N-glycosylase, J. Biol. Chem. 275 (2000) 32422–32429.
[41] K. Krusong, E.P. Carpenter, S.R. Bellamy, R. Savva, G.S. Baldwin, A comparative study of uracil DNA glycosylases from human and herpes simplex virus type 1, J. Biol. Chem. (2005).
[42] L. Aravind, D. Landsman, AT-hook motifs identified in a wide variety of DNA-binding proteins, Nucl. Acids Res. 26 (1998) 4413–4421.
[43] R. Reeves, Molecular biology of HMGA proteins: hubs of nuclear function, Gene 277 (2001) 63–81.
[44] A.E. Vidal, I.D. Hickson, S. Boiteux, J.P. Radicella, Mechanism of stimulation of the DNA glycosylase activity of hOGG1 by the major human AP endonuclease: bypass of the AP lyase activity step, Nucl. Acids Res. 29 (2001) 1285–1292.
[45] R.T. Hay, SUMO: a history of modification, Mol. Cell 18 (2005) 1–12.
[46] R.J. Dohmen, SUMO protein modification, Biochim. Biophys. Acta 1695 (2004) 113–131.
[47] G. Gill, SUMO and ubiquitin in the nucleus: different functions, similar mechanisms? Genes Dev. 18 (2004) 2046–2059.
[48] H. Takahashi, S. Hatakeyama, H. Saitoh, K.I. Nakayama, Noncovalent SUMO-1 binding activity of thymine DNA glycosylase (TDG) is required for its SUMO-1 modification and colocalization with the promyelocytic leukemia protein, J. Biol. Chem. 280 (2005) 5611–5621.
[49] J. Song, L.K. Durrin, T.A. Wilkinson, T.G. Krontiris, Y. Chen, Identification of a SUMO-binding motif that recognizes SUMO-modified proteins, Proc. Natl. Acad. Sci. U.S.A. 101 (2004) 14373–14378.
[50] F. Melchior, M. Schergaut, A. Pichler, SUMO: ligases, isopeptidases and nuclear pores, Trends Biochem. Sci. 28 (2003) 612–618.
[51] M. Ehrlich, M.A. Gama-Sosa, L.H. Huang, R.M. Midgett, K.C. Kuo, R.A. McCune, C. Gehrke, Amount and distribution of 5-methylcytosine in human DNA from different types of tissues of cells, Nucl. Acids Res. 10 (1982) 2709–2721.
[52] L.M. Field, F. Lyko, M. Mandrioli, G. Prantera, DNA methylation in insects, Insect Mol. Biol. 13 (2004) 109–115.
[53] F. Antequera, M. Tamame, J.R. Villanueva, T. Santos, DNA methylation in the fungi, J. Biol. Chem. 259 (1984) 8033–8036.
[54] B. Hendrich, U. Hardeland, H.H. Ng, J. Jiricny, A. Bird, The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites, Nature 401 (1999) 301–304.
[55] A. Bellacosa, Role of MED1 (MBD4) gene in DNA repair and human cancer, J. Cell Physiol. 187 (2001) 137–144.
[56] C.B. Millar, J. Guy, O.J. Sansom, J. Selfridge, E. MacDougall, B. Hendrich, P.D. Keightley, S.M. Bishop, A.R. Clarke, A. Bird, Enhanced CpG mutability and tumorigenesis in MBD4-deficient mice, Science 297 (2002) 403–405.
[57] M.S. Greenblatt, W.P. Bennett, M. Hollstein, C.C. Harris, Mutations in the p53 tumor suppressor gene: clues to cancer etiology and molecular pathogenesis, Cancer Res. 54 (1994) 4855–4878.
[58] E. Kondo, Z. Gu, A. Horii, S. Fukushige, The thymine DNA glycosylase MBD4 represses transcription and is associated with methylated p16(INK4a) and hMLH1 genes, Mol. Cell. Biol. 25 (2005) 4388–4396.
[59] G.P. Pfeifer, Mutagenesis at methylated CpG sequences, Curr. Top. Microbiol. Immunol. 301 (2006) 259–281.
[60] M. Otterlei, E. Warbrick, T.A. Nagelhus, T. Haug, G. Slupphaug, M. Akbari, P.A. Aas, K. Steinsbekk, O. Bakke, H.E. Krokan, Post-replicative base excision repair in replication foci, EMBO J. 18 (1999) 3834–3844.
[61] H. Nilsen, I. Rosewell, P. Robins, C. Skjelbred, S. Andersen, G. Slupphaug, G. Daly, H.E. Krokan, T. Lindahl, D.E. Barnes, Uracil-DNA glycosylase (UNG)-deficient mice reveal a primary role of the enzyme during DNA replication, Mol. Cell 5 (2000) 1059–1065.
[62] Q. An, P. Robins, T. Lindahl, D.E. Barnes, C → T mutagenesis and gamma-radiation sensitivity due to deficiency in the Smug1 and Ung DNA glycosylases, EMBO J. 24 (2005) 2205–2213.
[63] G.S. Lee, V.L. Brandt, D.B. Roth, B cell development leads off with a base hit: dU:dG mismatches in class switching and hypermutation, Mol. Cell 16 (2004) 505–508.
[64] C. Rada, G.T. Williams, H. Nilsen, D.E. Barnes, T. Lindahl, M.S. Neuberger, Immunoglobulin isotype switching is inhibited and somatic hypermutation perturbed in UNG-deficient mice, Curr. Biol. 12 (2002) 1748–1755.
[65] J.M. Di Noia, C. Rada, M.S. Neuberger, SMUG1 is able to excise uracil from immunoglobulin genes: insight into mutation versus repair, EMBO J. 25 (2006) 585–595.
[66] H.M. Bolt, Roles of etheno-DNA adducts in tumorigenicity of olefins, Crit. Rev. Toxicol. 18 (1988) 299–309.
[67] M. Saparbaev, J. Laval, Enzymology of the repair of etheno adducts in mammalian cells and in Escherichia coli, IARC Sci. Publ. 150 (1999) 249–261.
[68] L.J. Marnett, P.C. Burcham, Endogenous DNA adducts: potential and paradox, Chem. Res. Toxicol. 6 (1993) 771–785.
[69] J. Nair, A. Barbin, I. Velic, H. Bartsch, Etheno DNA-base adducts from endogenous reactive species, Mutat. Res. 424 (1999) 59–69.
[70] M. Saparbaev, J. Laval, 3,N4-Ethenocytosine, a highly mutagenic adduct, is a primary substrate for Escherichia coli double-stranded uracil-DNA glycosylase and human mismatch-specific thymine-DNA glycosylase, Proc. Natl. Acad. Sci. U.S.A. 95 (1998) 8508–8513.
[71] E. Lutsenko, A.S. Bhagwat, The role of the Escherichia coli mug protein in the removal of uracil and 3,N(4)-ethenocytosine from DNA, J. Biol. Chem. 274 (1999) 31034–31038.
[72] J. Jurado, A. Maciejewska, J. Krwawicz, J. Laval, M.K. Saparbaev, Role of mismatch-specific uracil-DNA glycosylase in repair of 3,N4-ethenocytosine in vivo, DNA Rep. 3 (2004) 1579–1590.
[73] B. Kavli, O. Sundheim, M. Akbari, M. Otterlei, H. Nilsen, F. Skorpen, P.A. Aas, L. Hagen, H.E. Krokan, G. Slupphaug, hUNG2 is the major repair enzyme for removal of uracil from U:A matches, U:G mismatches, and U in single-stranded DNA, with hSMUG1 as a broad specificity backup, J. Biol. Chem. 277 (2002) 39926–39936.
[74] F. Petronzelli, A. Riccio, G.D. Markham, S.H. Seeholzer, M. Genuardi, M. Karbowski, A.T. Yeung, Y. Matsumoto, A. Bellacosa, Investigation of the substrate spectrum of the human mismatch-specific DNA N-glycosylase MED1 (MBD4): fundamental role of the catalytic domain, J. Cell Physiol. 185 (2000) 473–480.
[75] J. Bastien, C. Rochette-Egly, Nuclear retinoid receptors and the transcription of retinoid-target genes, Gene 328 (2004) 1–16.
[76] E. Kalkhoven, CBP and p300: HATs for different occasions, Biochem. Pharmacol. 68 (2004) 1145–1155.
[77] A. Bird, DNA methylation patterns and epigenetic memory, Genes Dev. 16 (2002) 6–21.
[78] W. Reik, W. Dean, J. Walter, Epigenetic reprogramming in mammalian development, Science 293 (2001) 1089–1093.
[79] K.D. Robertson, DNA methylation and human disease, Nat. Rev. Genet. 6 (2005) 597–610.
[80] A. Jeltsch, Molecular enzymology of mammalian DNA methyltransferases, Curr. Top. Microbiol. Immunol. 301 (2006) 203–225.
[81] S.K. Bhattacharya, S. Ramchandani, N. Cervoni, M. Szyf, A mammalian protein with specific demethylase activity for mCpG DNA, Nature 397 (1999) 579–583.
[82] J.P. Jost, Nuclear extracts of chicken embryos promote an active demethylation of DNA by excision repair of 5-methyldeoxycytidine, Proc. Natl. Acad. Sci. U.S.A. 90 (1993) 4684–4688.
[83] J.P. Jost, Y.C. Jost, Transient DNA demethylation in differentiating mouse myoblasts correlates with higher activity of 5-methyldeoxycytidine excision repair, J. Biol. Chem. 269 (1994) 10040–10043.
[84] J.P. Jost, Y.C. Jost, Mechanism of active DNA demethylation during embryonic development and cellular differentiation in vertebrates, Gene 157 (1995) 265–266.
[85] M. Fremont, M. Siegmann, S. Gaulis, R. Matthies, D. Hess, J.P. Jost, Demethylation of DNA by purified chick embryo 5-methylcytosine-DNA glycosylase requires both protein and RNA, Nucl. Acids Res. 25 (1997) 2375–2380.
[86] J.P. Jost, M. Fremont, M. Siegmann, J. Hofsteenge, The RNA moiety of chick embryo 5-methylcytosine-DNA glycosylase targets DNA demethylation, Nucl. Acids Res. 25 (1997) 4545–4550.
[87] J.P. Jost, S. Schwarz, D. Hess, H. Angliker, F.V. Fuller-Pace, H. Stahl, S. Thiry, M. Siegmann, A chicken embryo protein related to the mammalian DEAD box protein p68 is tightly associated with the highly purified protein–RNA complex of 5-MeC-DNA glycosylase, Nucl. Acids Res. 27 (1999) 3245–3252.
[88] J.P. Jost, E.J. Oakeley, B. Zhu, D. Benjamin, S. Thiry, M. Siegmann, Y.C. Jost, 5-Methylcytosine DNA glycosylase participates in the genome-wide loss of DNA methylation occurring during mouse myoblast differentiation, Nucl. Acids Res. 29 (2001) 4452–4461.
[89] J.P. Jost, M. Siegmann, L. Sun, R. Leung, Mechanisms of DNA demethylation in chicken embryos. Purification and properties of a 5-methylcytosine-DNA glycosylase, J. Biol. Chem. 270 (1995) 9734–9739.
[90] H.D. Morgan, W. Dean, H.A. Coker, W. Reik, S.K. Petersen-Mahrt, Activation-induced cytidine deaminase deaminates 5-methylcytosine in DNA and is expressed in pluripotent tissues: implications for epigenetic reprogramming, J. Biol. Chem. 279 (2004) 52353–52360.
[91] R.S. Johnson, B. van Lingen, V.E. Papaioannou, B.M. Spiegelman, A null mutation at the c-jun locus causes embryonic lethality and retarded cell growth in culture, Genes Dev. 7 (1993) 1309–1317.
[92] P. Kastner, J.M. Grondona, M. Mark, A. Gansmuller, M. LeMeur, D. Decimo, J.L. Vonesch, P. Dolle, P. Chambon, Genetic analysis of RXR alpha developmental function: convergence of RXR and RAR signaling pathways in heart and eye morphogenesis, Cell 78 (1994) 987–1003.
[93] Y. Oike, N. Takakura, A. Hata, T. Kaname, M. Akizuki, Y. Yamaguchi, H. Yasue, K. Araki, K. Yamamura, T. Suda, Mice homozygous for a truncated form of CREB-binding protein exhibit defects in hematopoiesis and vasculo-angiogenesis, Blood 93 (1999) 2771–2779.
[94] A.F. Wilks, P.J. Cozens, I.W. Mattaj, J.P. Jost, Estrogen induces a demethylation at the 5′ end region of the chicken vitellogenin gene, Proc. Natl. Acad. Sci. U.S.A. 79 (1982) 4252–4255.
[95] K. Niederreither, M. Harbers, P. Chambon, P. Dollé, Expression of T:G mismatch-specific thymidine-DNA glycosylase and DNA methyl transferase genes during development and tumorigenesis, Oncogene 17 (1998) 1577–1585.
[96] O. Minowa, T. Arai, M. Hirano, Y. Monden, S. Nakai, M. Fukuda, M. Itoh, H. Takano, Y. Hippou, H. Aburatani, K. Masumura, T. Nohmi, S. Nishimura, T. Noda, Mmh/Ogg1 gene inactivation results in accumulation of 8-hydroxyguanine in mice, Proc. Natl. Acad. Sci. U.S.A. 97 (2000) 4156–4161.
[97] M. Takao, S. Kanno, T. Shiromoto, R. Hasegawa, H. Ide, S. Ikeda, A.H. Sarker, S. Seki, J.Z. Xing, X.C. Le, M. Weinfeld, K. Kobayashi, J. Miyazaki, M. Muijtjens, J.H. Hoeijmakers, G. van der Horst, A. Yasui, Novel nuclear and mitochondrial glycosylases revealed by disruption of the mouse Nth1 gene encoding an endonuclease III homolog for repair of thymine glycols, EMBO J. 21 (2002) 3486–3493.
[98] B.P. Engelward, G. Weeda, M.D. Wyatt, J.L. Broekhof, J. de Wit, I. Donker, J.M. Allan, B. Gold, J.H. Hoeijmakers, L.D. Samson, Base excision repair deficient mice lacking the Aag alkyladenine DNA glycosylase, Proc. Natl. Acad. Sci. U.S.A. 94 (1997) 13087–13092.
[99] R. Jaenisch, A. Bird, Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals, Nat. Genet. 33 (2003) 245–254.
[100] E. Li, T.H. Bestor, R. Jaenisch, Targeted mutation of the DNA methyltransferase gene results in embryonic lethality, Cell 69 (1992) 915–926.
[101] M. Okano, D.W. Bell, D.A. Haber, E. Li, DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development, Cell 99 (1999) 247–257.
[102] D. Biniszkiewicz, J. Gribnau, B. Ramsahoye, F. Gaudet, K. Eggan, D. Humpherys, M.A. Mastrangelo, Z. Jun, J. Walter, R. Jaenisch, Dnmt1 overexpression causes genomic hypermethylation, loss of imprinting, and embryonic lethality, Mol. Cell. Biol. 22 (2002) 2124–2135.
[103] L. Sard, S. Tornielli, P. Gallinari, F. Minoletti, J. Jiricny, T. Lettieri, M.A. Pierotti, G. Sozzi, P. Radice, Chromosomal localizations and molecular analysis of TDG gene-related sequences, Genomics 44 (1997) 222–226.
[104] C. Schmuttle, P.A. Jones, Involvement of DNA methylation in human carcinogenesis, Biol. Chem. 379 (1998) 377–388.
[105] D.B. Longley, D.P. Harkin, P.G. Johnston, 5-Fluorouracil: mechanisms of action and clinical strategies, Nat. Rev. Cancer 3 (2003) 330–338.
[106] U. Sibghat, P. Gallinari, Y.Z. Xu, M.F. Goodman, L.B. Bloom, J. Jiricny, R. Day III, Base analog and neighboring base effects on substrate specificity of recombinant human G:T mismatch-specific thymine DNA-glycosylase, Biochemistry 35 (1996) 12926–12932.
[107] S.U. Lari, F. Al-Khodairy, M.C. Paterson, Substrate specificity and sequence preference of G:T mismatch repair: incision at G:T, O6-methylguanine:T, and G:U mispairs in DNA by human cell extracts, Biochemistry 41 (2002) 9248–9255.
[108] B. Kaina, A. Ziouta, K. Ochs, T. Coquerelle, Chromosomal instability, reproductive cell death and apoptosis induced by O6-methylguanine in Mex−, Mex+ and methylation-tolerant mismatch repair compromised cells: facts and models, Mutat. Res. 381 (1997) 227–241.