DNA Repair 12 (2013) 890–898
Molecular characterization of a putative plant homolog of MBD4 DNA glycosylase
Ángel Ramiro-Merina, Rafael R. Ariza, Teresa Roldán-Arjona ∗
Department of Genetics, University of Córdoba/Maimónides Institute for Research in Biomedicine of Córdoba (IMIBIC)/Reina Sofía University Hospital, 14071 Córdoba, Spain
Article info
Article history: Received 7 June 2013; received in revised form 2 August 2013; accepted 7 August 2013; available online 30 August 2013
Keywords: Base excision repair; DNA methylation; Deamination; Mismatches
Abstract
Methyl-CpG-binding domain 4 (MBD4) DNA glycosylase is involved in excision of spontaneous deamination products of cytosine and 5-methylcytosine in animals, but it is unknown whether related proteins perform similar functions in plants. We report here the isolation and biochemical characterization of a putative MBD4 homolog from Arabidopsis thaliana, designated as MBD4L (MBD4-like). The plant enzyme lacks the MBD domain present in mammalian MBD4 proteins, but conserves a DNA glycosylase domain with critical residues for substrate recognition and catalysis, and it is more closely related to MBD4 homologs than to other members of the HhH-GPD superfamily. Arabidopsis MBD4L excises uracil and thymine opposite G, and the presence of halogen substituents at C5 of the target base greatly increases its excision efficiency. No significant activity is detected on cytosine derivatives such as 5-methylcytosine or 5-hydroxymethylcytosine. The enzyme binds to the abasic site product generated after excision, which decreases its catalytic turnover in vitro. Both the full-length protein and an N-terminal truncated version retaining the catalytic domain exhibit a preference for a CpG sequence context, where most plant DNA methylation is found. Our results suggest that an important function of Arabidopsis MBD4L is to protect the plant genome from the mutagenic consequences of cytosine and 5-methylcytosine deamination. © 2013 Elsevier B.V. All rights reserved.
1. Introduction
Chemical instability of DNA is a constant, unavoidable source of endogenous damage in cells [1]. An important type of DNA base alteration is spontaneous hydrolytic deamination, which converts cytosine (C) and 5-methylcytosine (5-meC) to uracil (U) and thymine (T), respectively. Such events are predicted to occur at significant rates in cells [2,3] and, if the resulting U:G and T:G mispairs are not corrected before DNA replication, they lead to C:G to T:A transitions [4]. DNA deamination damage is normally handled by the Base Excision Repair (BER) pathway. BER is initiated by DNA glycosylases that excise the incorrect base and generate an abasic (apurinic/apyrimidinic, AP) site that is further processed before DNA synthesis and ligation restore DNA to the unmodified state [5]. Repair of uracil is competently performed by ubiquitous DNA glycosylases from the UDG superfamily [6] and by some members of the HhH-GPD superfamily [7–9]. Processing of T:G mispairs is comparatively less efficient [10], and 5-meC is more liable to deamination than C [11]. In fact, hydrolytic deamination of 5-meC is presumed to
∗ Corresponding author at: Department of Genetics, Edificio Gregor Mendel, Campus de Rabanales s/n, University of Córdoba, 14071 Córdoba, Spain. Tel.: +34 957 218 979; fax: +34 957 212 072. E-mail address:
[email protected] (T. Roldán-Arjona). 1568-7864/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.dnarep.2013.08.002
be a major source of human mutations, since a substantial proportion of germline point mutations associated with hereditary diseases and somatic mutations in tumors are C:G to T:A transitions at CpG sites, where 5-meC is predominantly found [12]. In mammalian cells, two DNA glycosylases that excise T from T:G mismatches have been identified: thymine DNA glycosylase (TDG) [13,14] and methyl-CpG-binding domain 4 DNA glycosylase (MBD4, also known as MED1) [15–17]. Although best known for their ability to remove T from T:G mispairs, both enzymes actually process U:G mispairs more efficiently [18,19]. TDG belongs to the UDG superfamily, and MBD4 is a member of the HhH-GPD superfamily, but despite the complete lack of sequence homology both enzymes have similar substrate specificities [17,19,20]. Mammalian MBD4 was identified as a member of a family of mammalian methyl-CpG binding proteins [21] and also as an interactor of the human mismatch repair protein MLH1 [16]. The enzyme contains an N-terminal methyl-CpG-binding domain (MBD) and a C-terminal DNA glycosylase domain with homology to members of the HhH-GPD superfamily [16,21,22]. MBD4 is a monofunctional DNA glycosylase that excises U and T mispaired to G, with a preference for mismatches in a CpG context [15,17,18,20]. Like TDG, it exhibits a higher activity on halogenated uracil derivatives [20,23]. MBD4 has a very weak activity on 5-meC [24], but excises 5-HmeU at a significant rate [25–27]; 5-HmeU has been proposed as an intermediate in a multistep active DNA demethylation
pathway mediated by TET proteins and AID/APOBEC deaminases [28]. MBD4-deficient mice are viable and fertile, but show a 3-fold increase of C to T transitions at CpG sites, as well as accelerated tumorigenesis when heterozygous for a mutant allele of the adenomatous polyposis coli (Apc) gene [29,30]. On the other hand, mutated variants of MBD4 are frequent in human carcinomas with microsatellite instability [31,32], suggesting a possible role in tumor progression. There is also evidence for a function of MBD4 in apoptotic signaling, since MBD4-deficient mouse embryonic fibroblasts fail to undergo apoptosis upon treatment with genotoxic agents [33]. The multifunctional nature of MBD4 in mammalian cells raises the question of whether related proteins play similar roles in other organisms. Putative MBD4-like proteins with a conserved DNA glycosylase domain are found in animals, fungi, plants and stramenopiles [34]. As a first step to explore the significance of MBD4-like proteins in non-metazoan eukaryotes, we have set out to isolate and characterize the biochemical properties of a putative Arabidopsis homolog of MBD4. Plants possess a large family of proteins with a conserved MBD but no DNA glycosylase domain, and one of its members has been called MBD4 [35,36]. To avoid nomenclature confusion, we have designated the Arabidopsis putative homolog of vertebrate MBD4 DNA glycosylase as MBD4L (MBD4-like).
2. Materials and methods
2.1. Sequence analysis
Identification of potential MBD4 homologs was carried out by BLAST [37] searches at the National Center for Biotechnology Information (NCBI), using the amino acid sequence of human MBD4 as query against the Arabidopsis protein and nucleotide databases. Homology of the protein sequences retrieved from the BLAST search was analyzed by multiple sequence alignment with M-Coffee [38]. The alignment was viewed, adjusted and refined manually with Vector NTI Software (Invitrogen).
Sequence feature analysis was performed with InterProScan [39] and DISOPRED [40]. A 3D model structure of the aligned region from Arabidopsis MBD4L (amino acids 307–445) was built using Swiss-Model [41] and the 3D structures of Mus musculus and human MBD4 (Protein Data Bank accession codes: 4EVV and 4DK9 [26,42]) as templates. Nucleic acid coordinates extracted from 4DK9 were used to superimpose a DNA structure with a flipped-out AP site analog onto the Arabidopsis MBD4L model. The structural figures were prepared with PyMOL (http://www.pymol.org/).
2.2. Expression and purification of MBD4L and its deletion derivative Δ290MBD4L
Full-length MBD4L (FL-MBD4L) cDNA was inserted into the pMAL-c2X expression vector (New England Biolabs) to obtain an in-frame malE fusion. An MBD4L deletion construct containing the DNA glycosylase domain (Δ290MBD4L) was generated by performing PCR on MBD4L cDNA with primers AtMBD4 F3 EcoRI and AtMBD4 R4 SalI (Supplemental Table 1). Expression of full-length MBD4L and Δ290MBD4L was carried out in Escherichia coli BL21 (DE3) ung-151 cells [43]. A fresh single transformant colony was inoculated into 10 mL of LB medium supplemented with glucose (2 g/L) and containing carbenicillin (50 µg/mL), tetracycline (10 µg/mL) and chloramphenicol (34 µg/mL) and the culture was incubated at 37 °C overnight with shaking. A 2 mL aliquot of the overnight culture was inoculated into 200 mL of LB medium supplemented with glucose (2 g/L) and containing
carbenicillin (50 µg/mL), tetracycline (10 µg/mL) and chloramphenicol (34 µg/mL), and incubated at 37 °C, 250 rpm, until the A600 was 0.2. The culture was then placed at 15 °C and incubation continued at 250 rpm until the A600 reached 0.6. The expression was induced by adding isopropyl-1-thio-β-D-galactopyranoside (IPTG) to 300 µM and incubating for 16 h. After induction, cells were collected by centrifugation at 13,000 × g for 30 min. The pellet was resuspended in 5 mL of column buffer (20 mM Tris–HCl, pH 7.4, 200 mM NaCl, 1 mM dithiothreitol, 1 mM EDTA) and frozen at −80 °C. The stored pellet was thawed in the presence of 5 µg/mL DNase I (Roche) and 1 mg/mL lysozyme (Roche) on ice for 15 min, cells were additionally disrupted by sonication and the lysate was clarified by centrifugation. The supernatant was loaded onto an MBPTrap HP column (GE Healthcare) and the recombinant proteins were purified following standard protocols. Proteins were eluted in elution buffer (10 mM maltose, 20 mM Tris–HCl, pH 7.4, 200 mM NaCl, 1 mM dithiothreitol, 1 mM EDTA) and collected in 0.5 mL fractions. An aliquot of each fraction was analyzed by SDS-PAGE, and those containing a single band were pooled and dialyzed against dialysis buffer (20 mM Tris–HCl, pH 7.4, 200 mM NaCl, 1 mM dithiothreitol, 1 mM EDTA, 50% glycerol). The protein preparation was divided into aliquots and stored at −80 °C. All steps were carried out at 4 °C or on ice. Protein concentrations were determined by the Bradford assay. Denatured proteins were analyzed by SDS-PAGE using broad range molecular weight standards (Bio-Rad).
2.3. Site-directed mutagenesis
Site-directed mutagenesis of Δ290MBD4L was performed using the QuikChange Site-Directed Mutagenesis Kit (Stratagene) according to the manufacturer’s instructions. The D429A mutation was introduced into pMAL-c2X-Δ290MBD4L by using the oligonucleotides AtMBD4 D429A F and AtMBD4 D429A R (Table S1). The mutant sequence was confirmed by DNA sequencing and the construct was used to transform E. coli strain BL21(DE3) ung-151. Mutant protein MBP-D429A-Δ290MBD4L was overexpressed and purified as described above.
2.4. DNA substrates
Oligonucleotides used as DNA substrates (Supplemental Table 2) were synthesized by Operon Biotech (http://www.operonbiotech.com) or Integrated DNA Technologies (http://eu.idtdna.com/) and purified by PAGE before use. Double-stranded DNA substrates were prepared by mixing a 5 µM solution of a 5′-fluorescein-labeled oligonucleotide (upper strand) with a 10 µM solution of an unlabeled oligomer (lower strand), heating to 95 °C for 5 min, and slowly cooling to room temperature.
2.5. Electrophoretic mobility shift assay (EMSA)
EMSA was performed using a fluorescein-labeled duplex oligonucleotide containing either a synthetic AP site opposite guanine (AP:G) or a C:G pair, prepared as described above. DNA-binding reaction mixtures (10 µL) contained 100 nM of labeled duplex substrate and different amounts of MBP-MBD4L and MBP-Δ290MBD4L (0, 1.66, 3.33 and 5 µM) in 10 mM Tris–HCl pH 8.0, 1 mM DTT, 10 µg/mL BSA, 1 mM EDTA. After 15 min incubation at 30 °C, reactions were immediately loaded onto 8% polyacrylamide gels in 0.5× TBE. Electrophoresis was carried out in 0.5× TBE for 60 min at 80 V at room temperature. Fluorescein-labeled DNA was visualized using the blue fluorescence mode of the FLA-5100 imager and analyzed using Multigauge software (Fujifilm).
[Fig. 1 graphic. (A) Domain diagrams showing the methyl-CpG-binding domain (MBD) and the glycosylase domain of H. sapiens MBD4 (580 amino acids), M. musculus MBD4 (554 amino acids), G. gallus MBD4 (416 amino acids), A. thaliana MBD4L (445 amino acids) and A. thaliana Δ290MBD4L (155 amino acids). (B) Multiple sequence alignment of A. thaliana MBD4L, G. gallus MBD4, H. sapiens MBD4 and M. musculus MBD4, annotated with the Δ290MBD4L boundary, the modeled region, a disordered region, predicted helices α1–α10 and the HhH motif.]
Fig. 1. Sequence conservation of the DNA glycosylase domain in MBD4 homologs. (A) Schematic alignment of the full-length amino acid sequences of Arabidopsis MBD4L and its homologs. Conserved domains are colored: MBD, green and DNA glycosylase, red. (B) Multiple sequence alignment of A. thaliana MBD4L (NP_974253.1) (amino acids 250–445) and MBD4 proteins from G. gallus (NP_990024.1), H. sapiens (AAC68879.1) and M. musculus (AAC68878.1). Inverted triangles indicate positions of critical human MBD4 residues involved in substrate recognition and catalysis, and the corresponding Arabidopsis MBD4L residues are shown in color (see text for details). Arrows indicate the deletion of 290 amino acids in Δ290MBD4L and the sequence region modeled in Supplemental Fig. 3. The predicted secondary structure of Arabidopsis MBD4L is indicated above the sequence. The helix-hairpin-helix of the HhH-GPD motif is shown in blue.
2.6. Enzyme activity assays
Double-stranded oligodeoxynucleotides (40 nM) were incubated at 30 °C for the indicated times in a reaction mixture containing 50 mM Tris–HCl (pH 8.0), 1 mM EDTA, 1 mM DTT, 0.1 mg/mL BSA, and the indicated amounts of MBP-MBD4L, MBP-Δ290MBD4L or MBP-D429A-Δ290MBD4L in a total volume of 50 µL. Reactions were stopped by adding 20 mM EDTA, 0.6% sodium dodecyl sulfate, and 0.5 mg/mL proteinase K, and the mixtures were incubated at 37 °C for 30 min. Samples were treated with 100 mM NaOH and immediately transferred to 90 °C for 10 min, and then neutralized by adding 30 mM Tris–HCl pH 8.0. DNA was extracted with phenol:chloroform:isoamyl alcohol (25:24:1) and ethanol precipitated at −20 °C in the presence of 0.3 mM NaCl and 16 mg/mL glycogen. Samples were resuspended in 10 µL of 90% formamide and heated at 95 °C for 5 min. Reaction products were separated in a 12% denaturing polyacrylamide gel containing 7 M urea. Fluorescein-labeled DNA was visualized using the blue fluorescence mode of the FLA-5100 imager and analyzed using Multigauge software (Fujifilm).
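As an illustration of how band intensities from such a gel could be turned into the quantities reported in the figures, the following minimal sketch computes the percentage of processed substrate and the corresponding product concentration. The formula (product signal divided by total lane signal) and the function names are assumptions made for illustration; the text does not state the exact quantification procedure used with the Multigauge software. The 40 nM default is the substrate concentration given above.

```python
def percent_processed(product_signal: float, substrate_signal: float) -> float:
    """Percentage of labeled DNA found in the incised product band.

    Assumes (hypothetically) that the incised fragment and the intact
    oligonucleotide are the only labeled species in the lane.
    """
    total = product_signal + substrate_signal
    if total <= 0:
        raise ValueError("total lane signal must be positive")
    return 100.0 * product_signal / total


def product_concentration_nM(percent: float, substrate_nM: float = 40.0) -> float:
    """Convert % processed substrate into nM of product.

    The default of 40 nM is the duplex concentration used in the assays.
    """
    return substrate_nM * percent / 100.0


# Hypothetical band intensities in arbitrary fluorescence units.
pct = percent_processed(product_signal=250.0, substrate_signal=750.0)
print(pct, product_concentration_nM(pct))
```

Normalizing to the total signal in each lane makes the estimate insensitive to lane-to-lane differences in the amount of DNA recovered after extraction and precipitation.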
2.7. Kinetic analysis
Data were fitted to the equation $[\mathrm{Product}] = P_{\max}\,[1 - \exp(-kt)]$ using non-linear regression analysis and the software Sigmaplot. For each mutant enzyme and substrate, the parameters $P_{\max}$ (maximum substrate processing within an unlimited period of time), $T_{50}$ (the time required to reach 50% of the product plateau level, $P_{\max}$), and the relative processing efficiency ($E_{rel} = P_{\max}/T_{50}$) were determined [44].
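The kinetic parameters defined above can be illustrated with a short sketch. Setting the product equal to half of the plateau in the fitted equation gives exp(−k·T50) = 1/2, so T50 = ln 2 / k, and the relative efficiency follows as Erel = Pmax/T50. The code below is not the Sigmaplot procedure used in the study: it is a stdlib-only least-squares fit in which, for each trial k, the best Pmax is obtained in closed form and k is then refined by a one-dimensional search. All function names, the search bounds and the example time points are hypothetical.

```python
import math


def model(t, pmax, k):
    """Single-exponential product accumulation: Pmax * (1 - exp(-k t))."""
    return pmax * (1.0 - math.exp(-k * t))


def best_pmax(times, products, k):
    """Closed-form least-squares Pmax for a fixed rate constant k."""
    f = [1.0 - math.exp(-k * t) for t in times]
    return sum(y * v for y, v in zip(products, f)) / sum(v * v for v in f)


def sse(times, products, k):
    """Sum of squared residuals at rate k, using the optimal Pmax for that k."""
    pmax = best_pmax(times, products, k)
    return sum((y - model(t, pmax, k)) ** 2 for t, y in zip(times, products))


def fit(times, products, k_lo=1e-4, k_hi=1e2, grid=200, iters=80):
    """Return (Pmax, k, T50, Erel); times must include at least one t > 0."""
    lo, hi = math.log(k_lo), math.log(k_hi)
    # Coarse scan over log(k) to bracket the minimum.
    logs = [lo + i * (hi - lo) / (grid - 1) for i in range(grid)]
    errs = [sse(times, products, math.exp(x)) for x in logs]
    i = errs.index(min(errs))
    a, b = logs[max(i - 1, 0)], logs[min(i + 1, grid - 1)]
    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(times, products, math.exp(c)) < sse(times, products, math.exp(d)):
            b = d
        else:
            a = c
    k = math.exp((a + b) / 2.0)
    pmax = best_pmax(times, products, k)
    t50 = math.log(2.0) / k  # from Pmax/2 = Pmax * (1 - exp(-k * T50))
    return pmax, k, t50, pmax / t50


# Synthetic, noise-free time course (hours, nM); values are illustrative only.
times = [0.5, 1.0, 2.0, 4.0, 8.0, 24.0]
products = [model(t, 20.0, 0.5) for t in times]
pmax, k, t50, erel = fit(times, products)
print(round(pmax, 3), round(k, 3), round(t50, 3), round(erel, 3))
```

With Pmax expressed in nM and T50 in hours, Erel has units of nM/h, which matches the axis label used for the relative efficiencies in Fig. 5C.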
3. Results
3.1. Identification of a putative MBD4 homolog in Arabidopsis thaliana
To determine whether the A. thaliana genome encodes an MBD4 homolog we queried the Arabidopsis protein and nucleotide sequence databases using the amino acid sequence of human MBD4. The search yielded a significant hit to a predicted gene located on chromosome 3 (At3g07930). A full-length cDNA clone for this locus from the Riken Arabidopsis Full Length (RAFL) collection [45,46] was obtained from the Arabidopsis Biological Resource Center (ABRC) and sequenced. The Arabidopsis At3g07930 cDNA encodes a protein of 445 amino acids with a C-terminal region (amino acids 306–445) showing 43% identity with the DNA glycosylase domain of human MBD4 (Fig. 1). We have designated this protein as MBD4L (MBD4-like). Sequence analysis of the N-terminal region (amino acids 1–305) of MBD4L did not detect any similarity to the MBD domain or to any other conserved domain in known protein families. Phylogenetic analysis using full-length protein sequences indicates that Arabidopsis MBD4L is more closely related to members of the MBD4 subfamily than to other members of the HhH-GPD superfamily (Supplemental Fig. 1). A similar analysis
Fig. 2. Arabidopsis MBD4L excises U and T opposite G. (A) Schematic diagram of molecules used as DNA substrates. Double-stranded oligonucleotides contained a lesion (X = T or U) and a fluorescein-labeled 5′ end (indicated by an asterisk) on the upper strand. The size of the 5′ end-labeled fragment generated after base excision, NaOH and heat treatment is also indicated. (B and C) Time course of T and U excision. MBD4L (B) and Δ290MBD4L (C) (1 µM) were incubated at 30 °C in a 50 µL reaction mixture with a 5′-labeled 51-mer oligonucleotide duplex (40 nM) containing either a T:G or a U:G mispair. Reactions were stopped at the indicated times, products were separated in a 12% denaturing polyacrylamide gel and quantified by fluorescence scanning (right). Values are means ± SD (error bars) from two independent experiments.
using only the DNA glycosylase domain yielded analogous results (data not shown). A multiple sequence alignment revealed that sequence conservation among putative plant MBD4L homologs is restricted to the C-terminal DNA glycosylase domain (Supplemental Fig. 2). The DNA glycosylase domain of Arabidopsis MBD4L exhibits a HhH-GPD motif signature and conserves critical residues involved in substrate recognition and catalysis in MBD4 glycosylases (Fig. 1B). The D429 residue downstream of the HhH motif corresponds to the catalytic aspartate invariant among HhH-GPD proteins [22]. Residue K337 is at the same position as human MBD4 R468, which is involved in nucleotide flipping by penetrating the minor groove and expelling the target base [42]. Amino acids L375, G376 and L377 are identical to those that in human MBD4 contribute to specificity for guanine as the pairing partner of the target base [42]. We generated a 3D model of the DNA glycosylase domain of Arabidopsis MBD4L using the crystal structures of mouse and human MBD4 proteins (Protein Data Bank accession codes: 4EVV and 4DK9 [26,42]) as templates (Supplemental Fig. 3). The model predicts a typical HhH-GPD core structure, with the key amino acids mentioned above located in analogous positions to those of the mammalian homologs.
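A figure such as the 43% identity reported for the glycosylase domain is computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over columns in which neither sequence has a gap. This convention is an assumption, since the text does not say which denominator was used, and the two short aligned strings are invented placeholders, not the MBD4L or MBD4 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 4 identical residues over 5 ungapped columns.
print(percent_identity("ACDE-F", "ACNEQF"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are only comparable when the same denominator is used.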
3.2. Arabidopsis MBD4L excises U and T opposite G
The full-length cDNA containing the entire ORF of Arabidopsis MBD4L and a shorter sequence coding amino acids 291–445 were fused to the maltose binding protein (MBP) to generate MBP-MBD4L and MBP-Δ290MBD4L, respectively. Both recombinant proteins were expressed in E. coli ung− cells, purified to apparent homogeneity (Supplemental Fig. 4) and used in oligonucleotide incision assays (Fig. 2). As a substrate we used a 51-mer DNA duplex in which the 5′-end-labeled upper strand contained either a single U or T residue mispaired with G (Fig. 2A). We found that both MBD4L and Δ290MBD4L excised U, and less efficiently T, generating an abasic site that was cleaved upon alkaline treatment of the reaction products (Fig. 2B and C). The full-length protein exhibited a lower catalytic activity than the truncated version, but also showed a clear preference for U as the target base. We next investigated the effect of changing the base opposite the target on MBD4L activity (Fig. 3A). We found that Δ290MBD4L exhibits a strict specificity for G opposite the target base in the complementary strand, and does not show detectable activity when the pairing partner of the target base is A, T or C. Analogous results were obtained with full-length MBD4L (data not shown). To verify that the observed DNA glycosylase activity is intrinsic to Arabidopsis MBD4L we generated a mutant Δ290MBD4L protein in which the conserved aspartic acid residue (D429) in the glycosylase domain (Fig. 3B) was changed to alanine. The mutant protein (D429A-Δ290MBD4L) showed a greatly reduced DNA glycosylase activity, generating ∼160-fold less product than the wild-type version. We also confirmed that the enzyme is monofunctional, since it does not incise the AP site generated as a product after base excision (Fig. 3C). Altogether, these results indicate that Arabidopsis MBD4L is a monofunctional DNA glycosylase targeting U:G and T:G mispairs.
3.3. Arabidopsis MBD4L binds the AP site reaction product
Our preliminary biochemical analyses revealed that Arabidopsis MBD4L activity follows biphasic kinetics, with an initial burst of product accumulation followed by a slower phase (Fig. 2B and C). We decided to explore whether this reflects a low turnover rate. The low activity of the enzyme preparations did not allow reactions to be carried out under excess substrate conditions. Therefore we performed experiments at different enzyme concentrations (Supplemental Fig. 5). The results revealed that the amplitude of the initial burst is correlated with the enzyme concentration used. This suggests that product concentration is limited by the concentration of MBD4L DNA glycosylase. In fact, the amount of product formed is much lower than the amount of enzyme used, thus suggesting that only a minor portion of the enzyme preparation is active. In any case, at the lower concentration used, which resembles multiple turnover conditions, the activity reached a plateau where no significant amount of product was generated. To investigate whether this behavior is due to inefficient release of the reaction product, we first performed a competition assay (Fig. 4A). We incubated full-length MBD4L or Δ290MBD4L either in the presence or the absence of an unlabeled double-stranded oligonucleotide containing a synthetic AP site opposite guanine (AP:G), and measured the activity on an equivalent labeled U:G substrate. We found that the incubation with unlabeled AP:G DNA strongly inhibited the processing of the U:G mispair. To test whether MBD4L inhibition results from binding to the AP site, we performed a gel-mobility shift assay incubating either full-length MBD4L or Δ290MBD4L with a labeled duplex oligonucleotide containing either a synthetic AP site opposite guanine (AP:G) or a C:G pair (Fig. 4B). We found that both proteins formed a specific complex with the DNA duplex containing the AP site, but not with the homoduplex DNA. Furthermore, using the catalytically inactive D429A-Δ290MBD4L mutant protein, we found that the catalytic domain does not form a detectable complex with a DNA duplex containing a U:G mispair (Fig. 4C), thus confirming a specific binding to the AP site. We next asked whether the lower activity of FL-MBD4L compared to that of Δ290MBD4L could be related to different affinities for the reaction product. To examine this question we performed EMSA competition experiments with an AP:G probe and increasing amounts of unlabeled AP:G duplex (Fig. 4D). We found that the relative affinity of FL-MBD4L for the reaction product is lower than that of the truncated Δ290MBD4L protein. We therefore conclude that MBD4L specifically binds its reaction product, that the N-terminal domain of the protein is not involved in such binding, and that the higher activity of Δ290MBD4L is not caused by reduced product binding.
3.4. Arabidopsis MBD4L excises uracil derivatives
We next explored the target base specificity of Arabidopsis MBD4L by performing oligonucleotide incision assays with DNA substrates containing different 5-substituted uracil and cytosine derivatives, or a purine oxidative lesion (8-oxoG) (Fig. 5). A first analysis revealed that both proteins were unable to process either 8-oxoG or cytosine derivatives such as 5-meC or 5-HmeC at a significant rate (Fig. 5B). Kinetic analysis confirmed that MBD4L and Δ290MBD4L display a substantial activity on 5-halogen uracils, but also process methylated and/or oxidized derivatives, with preferences ordered thus: 5-BrU > 5-FU ≫ U > T (5-meU) ≈ 5-HmeU ≈ 5-HU (Fig. 5C). The relative catalytic efficiency of Δ290MBD4L is about 7-fold higher than that of full-length MBD4L, but both proteins exhibit similar substrate specificity.
Fig. 3. MBD4L exhibits a strict specificity for G opposite the target base. (A) Effect of opposite base on T and U glycosylase activity. MBP-Δ290MBD4L (1 µM) was incubated for 24 h at 30 °C in a 50 µL reaction mixture containing 40 nM of 5′-labeled 51-mer oligonucleotide duplex containing T or U opposite A, C, G or T, as indicated. Reactions were stopped and the products were separated in a 12% denaturing polyacrylamide gel. (B) Effect of D429A mutation on enzymatic activity. A DNA duplex (40 nM) containing a U:G mispair was incubated with wild-type Δ290MBD4L or mutant D429A-Δ290MBD4L (1 µM) for 24 h at 30 °C. (C) MBD4L is a monofunctional DNA glycosylase with no AP lyase activity. A DNA duplex (40 nM) containing a U:G mispair was incubated in the absence (lane 1) or the presence of Δ290MBD4L (1 µM) (lanes 2–4) for 24 h at 30 °C. After base excision, the AP-site reaction product was stabilized with NaBH4 (lane 2), heat-treated at 50 °C (lane 3) or NaOH-treated at 95 °C (lane 4). Reaction products were separated and detected as described above.
3.5. Arabidopsis MBD4L has a preference for a CpG sequence context
We next asked whether excision of U and T by Arabidopsis MBD4L is affected by sequence context. Most mammalian and plant DNA methylation is restricted to symmetrical CG sequences, but plants also have significant levels of cytosine methylation in the symmetric context CHG (where H is A, C or T) and even in asymmetric sequences [47,48]. We therefore incubated full-length MBD4L protein with double-stranded oligonucleotide substrates containing T or U opposite G in a CG, CHG or CHH sequence context. We found that MBD4L processes U:G and T:G mispairs with about 2-fold higher efficiency in a CG context than in CHG or CHH sequences (Fig. 6). The truncated protein Δ290MBD4L also displays higher activity on a CG context (Supplemental Fig. 6), which suggests that the molecular determinants of such preference reside in the C-terminal DNA glycosylase domain of MBD4L. Since symmetrical sequences may be present in either the hemimethylated or bimethylated state, we also asked whether U or T excision in one strand at CG or CHG sequence contexts is affected by the methylation status of the C in the complementary strand (Supplemental Fig. 6). We found no significant differences in excision efficiency between hemi- and bimethylated sequences, for either MBD4L or Δ290MBD4L. Altogether, these results indicate that the DNA glycosylase domain of Arabidopsis MBD4L displays a preference for those sequences more likely to be methylated, irrespective of the methylation status of the complementary strand.
4. Discussion
Phylogenetic analysis supports the view [49] that the ancestral version of MBD4 lacked any MBD, which was later acquired in the metazoan clade and subsequently lost in some lineages (Supplemental Fig. 7). Although a large family of plant proteins with a conserved MBD motif exists, most of its members do not bind specifically to methylated DNA, and none contains a DNA glycosylase domain [36].
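The CG, CHG and CHH sequence contexts compared in Section 3.5 can be made concrete with a small classifier. The sketch below labels a cytosine by the two bases that follow it on the same strand, using the definition given in the text (H is A, C or T). The function name and the example sequences are hypothetical; in the assays the classified position is occupied by T or U, the deamination products, rather than by C itself.

```python
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i of a 5'->3' sequence.

    Returns "CG", "CHG" or "CHH" (H = A, C or T), or None when the
    sequence ends before the context can be determined.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    downstream = seq[i + 1:i + 3]
    if not downstream:
        return None  # cytosine is the last base
    if downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None  # need a second downstream base to tell CHG from CHH
    return "CHG" if downstream[1] == "G" else "CHH"


# Hypothetical example sequences, one per context.
for s in ("ACGT", "ACAGT", "ACATT"):
    print(s, cytosine_context(s, 1))
```

Only CG and CHG are symmetric, meaning that a cytosine in the same context is also present on the complementary strand; this is why hemimethylated and bimethylated states were compared for those two contexts only.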
Our results indicate that the N-terminal domain of Arabidopsis MBD4L, which is not conserved among plant MBD4L homologs, does not play any direct role in catalysis. The N-terminal
Fig. 4. Arabidopsis MBD4L binds to DNA containing an AP site. (A) Effect of incubation with DNA containing an AP site on MBD4L activity. Purified full-length MBD4L (circles) or Δ290MBD4L (triangles) (500 nM) were incubated with a fluorescein-labeled U:G substrate (40 nM) in the absence (filled symbols) or presence (open symbols) of 40 nM unlabeled duplex containing a synthetic AP site opposite guanine (AP:G). Reactions were stopped at the indicated times, products were separated in a 12% denaturing polyacrylamide gel, and the relative amount of incised oligonucleotide was quantified by fluorescence scanning. (B) Electrophoretic mobility shift assay. Increasing amounts of MBD4L or Δ290MBD4L purified proteins were incubated with a labeled duplex oligonucleotide containing either a synthetic AP site opposite guanine (AP:G) or a C:G pair. After non-denaturing gel electrophoresis, protein-DNA complexes were identified by their retarded mobility compared with that of free DNA, as indicated. (C) Electrophoretic mobility shift assay. D429A-Δ290MBD4L purified protein (5 µM) was incubated with a labeled duplex oligonucleotide containing either a synthetic AP site opposite guanine (AP:G), a C:G pair or a U:G mispair. After non-denaturing gel electrophoresis, protein-DNA complexes were identified as indicated. (D) Electrophoretic mobility shift assay. MBD4L (top panel) or Δ290MBD4L (center panel) purified proteins (5 µM) were incubated with a labeled duplex oligonucleotide containing an AP:G mispair and increasing amounts of a non-labeled AP:G duplex as a competitor. After non-denaturing gel electrophoresis, protein-DNA complexes were identified by their retarded mobility compared with that of free DNA, as indicated, and quantified by fluorescence scanning. The bottom panel shows the relative decrease in protein-DNA complex observed for each protein in the presence of increasing competitor:probe ratios. Values are means ± SD (error bars) from two independent experiments.
truncated protein Δ290MBD4L displays the same substrate specificity as full-length MBD4L, with a closely similar preference for both target base and sequence context, but exhibits a higher catalytic activity. The reason for such increased activity is presently unclear. The N-terminal region is not required for product binding, and furthermore Δ290MBD4L exhibits a higher affinity for an AP site compared to FL-MBD4L. Therefore, the higher activity of
the truncated version is not due to increased catalytic turnover by facilitated product release. In mammalian MBD4, the substrate spectrum is also determined by the catalytic domain of the protein [17] but, unlike the plant enzyme, both the full-length and the truncated versions display a similar catalytic efficiency [18]. The possible role of the N-terminal domain in plant MBD4L proteins is currently unknown. As mentioned above, this region is not
[Fig. 5 graphic. (A) Chemical structures of the bases tested: uracil (U), thymine (T), 5-fluorouracil (5-FU), 5-bromouracil (5-BrU), 5-hydroxyuracil (5-HU), 5-hydroxymethyluracil (5-HmeU), cytosine (C), 5-methylcytosine (5-meC), 5-bromocytosine (5-BrC), 5-hydroxymethylcytosine (5-HmeC) and 8-oxoguanine (8-oxoG). (B) Bar chart of % processed substrate for MBD4L and Δ290MBD4L on each target base. (C) Bar charts of relative efficiency (nM/h) for MBD4L and Δ290MBD4L on U, T, 5-FU, 5-BrU, 5-HU and 5-HmeU.]
Fig. 5. Arabidopsis MBD4L excises uracil derivatives. (A) Chemical structures of substrate DNA bases tested in this study. (B) Purified full-length MBD4L (white bars) or Δ290MBD4L (gray bars) (1 µM) were incubated at 30 °C for 24 h with 51-mer double-stranded oligonucleotide substrates (40 nM) containing different target DNA bases, paired with G, at position 29 of the labeled upper strand. Products were separated in a 12% denaturing polyacrylamide gel and the amount of incised oligonucleotide was quantified by fluorescence scanning. Values are means ± SD (error bars) from three independent experiments. (C) Substrate processing ability of full-length MBD4L (top panel) and Δ290MBD4L (bottom panel) on several uracil derivatives paired with G. Reactions were performed as indicated above and relative processing efficiencies were determined as described in Section 2. Values are means ± SD (error bars) from three independent experiments.
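The quantification summarized in panel B can be illustrated with a short Python sketch. It assumes, hypothetically, that each gel lane yields one fluorescence intensity for the incised product band and one for the remaining intact substrate band, and that the percentage of processed substrate is the product share of their sum. The function names and intensity values are invented for illustration and are not data or methods taken from this study.

```python
from statistics import mean, stdev


def percent_processed(product, substrate):
    """Percentage of substrate converted to incised product in one lane."""
    total = product + substrate
    if total <= 0:
        raise ValueError("lane has no detectable signal")
    return 100.0 * product / total


def summarize(replicates):
    """Mean and sample SD of percent processed over independent experiments."""
    values = [percent_processed(p, s) for p, s in replicates]
    return mean(values), stdev(values)


# Hypothetical (product, substrate) band intensities from three experiments.
example_lanes = [(300.0, 700.0), (350.0, 650.0), (250.0, 750.0)]
avg, sd = summarize(example_lanes)
print(f"{avg:.1f} ± {sd:.1f} % processed")
```

The sample standard deviation is used here because only three replicates are available; whether the authors used the sample or population form is not stated in the caption.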
T:G mismatches but prefer 5-meC as substrate and are involved in active DNA demethylation [54–58]. Both the T and 5-meC excision activities of ROS1 and DME proteins are higher in a CpG context [54], which suggests that, in addition to initiating active DNA demethylation, they may also be involved in counteracting the potential mutagenic effects of 5-meC deamination. MBD4L is the only plant enzyme identified so far acting on both U:G and T:G mispairs. In this regard, it is significant that Arabidopsis MBD4L excises both U and T more efficiently in a CpG context, where most plant methylation occurs. In comparison,
conserved in plant MBD4L homologs (Supplemental Fig. 2), and the only discernible feature is the presence of one to several copies of a short peptide with a consensus sequence SPxh (where x is any amino acid and h a hydrophobic residue) [34]. It is possible that such repeats are targets for post-translational modifications involved in protein-protein interactions, since similar short phospho-motifs are recognized by BRCT (BRCA1 carboxyl-terminal) domains [50].
Like mammalian TDG and MBD4, Arabidopsis MBD4L is actually a uracil DNA glycosylase, and its activity on 5-substituted uracil derivatives depends on the nature of the substituent. Thus, the activity of the enzyme decreases when the group at C5 is CH3, OH or CH2OH, which are electron-donating groups, but increases significantly with electron-withdrawing substituents such as Br or F, which stabilize the transition state and enhance the leaving ability of the base. Therefore, as previously proposed for TDG [51,52], the substrate specificity of MBD4L may substantially depend on the stability of the scissile C–N bond. However, selective recognition of the base at the active site must also play an important role, since cytosine and its C5-derivatives are not recognized as substrates.
Four different uracil DNA glycosylases have been identified in mammals: uracil DNA N-glycosylase (UNG), single-strand-selective monofunctional uracil-DNA glycosylase 1 (SMUG1), TDG and MBD4. Among them, only TDG and MBD4 display an additional activity against T:G mismatches. Arabidopsis lacks TDG and SMUG1 orthologs [34], but possesses a UNG homolog that is required for BER of uracil in vitro and is the major uracil DNA glycosylase detected in cell-free extracts [53]. In addition, the Arabidopsis genome encodes four members of the plant-specific ROS1/DME subfamily of HhH-GPD proteins [54,55]. These proteins, which do not show detectable activity on U:G mispairs, are able to process
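[Fig. 6 graphic. Bar chart, y-axis relative efficiency (nM/h), for T and U target bases in CG, CHG (H=C), CHG (H=A) and CHH sequence contexts.]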
Fig. 6. Arabidopsis MBD4L has a preference for a CpG context. Purified full-length MBD4L (1 µM) was incubated at 30 °C with double-stranded oligonucleotide substrates (40 nM) containing either a U:G or a T:G mispair in different sequence contexts. Reaction products were separated in a 12% denaturing polyacrylamide gel and quantified by fluorescence scanning. Relative processing efficiencies were determined as described in Section 2. Values are means ± SD (error bars) from three independent experiments.
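The sequence contexts compared in this figure follow the usual plant methylation nomenclature (CG, CHG and CHH, where H is A, C or T). A minimal Python sketch of that classification is given below. It assumes the strand carrying the target base is read 5'→3' and that the target (C, or the U or T that replaces it in a mispair) sits at a known index; the function name and example sequences are hypothetical and are not the oligonucleotides used in this study.

```python
def methylation_context(seq, pos):
    """Classify the context of the target base at index `pos` of `seq` (5'->3').

    Only the two bases downstream of the target are inspected. Returns 'CG',
    'CHG' or 'CHH', or None when too few downstream bases are available.
    Ambiguity codes such as N are not handled in this sketch.
    """
    seq = seq.upper()
    downstream = seq[pos + 1:pos + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


# Hypothetical sequences with the target base at index 2.
for example in ("ATUGCA", "ATTCGA", "ATTAGC", "ATTAAC"):
    print(example, methylation_context(example, 2))
```

Because the rule looks only at downstream bases, the same function applies whether the target position holds C, U or T, which matches the way the mispairs are grouped by context in the figure.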
Arabidopsis UNG, like its mammalian homologs, lacks any preference for CpG contexts [53]. The weak but consistent preference of MBD4L for CpG sites is also exhibited by the catalytic domain of mammalian MBD4 [18,42], and argues in favor of a methylation-related function of both plant and animal enzymes. One can speculate that the ancestral version of MBD4 evolved to protect the stability of genome sequences frequently targeted for methylation, counteracting the mutagenic consequences of deamination events occurring in either the unmethylated or the methylated state. Alternatively, the increased U excision observed at CpG sequences may just be a passive consequence of selection for preferential T excision in those same contexts. In either scenario, targeting to CpG sequences may have been later reinforced by addition of an MBD domain to the catalytic domain in the animal lineage [49], but not in plants, which also methylate non-CpG sequences. An alternative possibility is that MBD4L functions as a heterodimer with some of the MBD proteins present in plants. In any case, a full understanding of the in vivo role of Arabidopsis MBD4L will require the generation of null alleles or RNAi-silenced lines.
Conflicts of interest statement
The authors declare that they have no conflicts of interest.
Acknowledgements
We thank S. E. Bennett (Oregon State University) for the kind gift of the E. coli BL21 (DE3) ung-151 strain. We also thank members of our laboratory for helpful discussions and advice. This work was supported by the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund (grant number BFU2010-18838), and by Junta de Andalucía, Spain (grant number P07CVI-02770). A.R-M. was the recipient of a PhD Fellowship from Junta de Andalucía, Spain.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.dnarep.2013.08.002.
References
[1] T. Lindahl, Instability and decay of the primary structure of DNA, Nature 362 (1993) 709–715.
[2] T. Lindahl, B. Nyberg, Heat-induced deamination of cytosine residues in deoxyribonucleic acid, Biochemistry 13 (1974) 3405–3410.
[3] J.C. Shen, W.M. Rideout 3rd, P.A. Jones, The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA, Nucleic Acids Res. 22 (1994) 972–976.
[4] B.K. Duncan, J.H. Miller, Mutagenic deamination of cytosine residues in DNA, Nature 287 (1980) 560–561.
[5] P. Fortini, E. Dogliotti, Base damage and single-strand break repair: mechanisms and functional significance of short- and long-patch repair subpathways, DNA Repair (Amst) 6 (2007) 398–409.
[6] L. Aravind, E.V. Koonin, The α/β fold uracil DNA glycosylases: a common origin with diverse fates, Genome Biol. 1 (2000) research0007.0001–0007.0008.
[7] E.C. Friedberg, G.C. Walker, W. Siede, R.D. Wood, R.A. Schultz, T. Ellenberger, DNA Repair and Mutagenesis, 2nd ed., ASM Press, Washington, DC, 2006.
[8] H.E. Krokan, F. Drablos, G. Slupphaug, Uracil in DNA – occurrence, consequences and repair, Oncogene 21 (2002) 8935–8948.
[9] J.H. Chung, E.K. Im, H.Y. Park, J.H. Kwon, S. Lee, J. Oh, K.C. Hwang, J.H. Lee, Y. Jang, A novel uracil-DNA glycosylase family related to the helix-hairpin-helix DNA glycosylase superfamily, Nucleic Acids Res. 31 (2003) 2045–2055.
[10] K. Wiebauer, J. Jiricny, In vitro correction of G.T mispairs to G.C pairs in nuclear extracts from human cells, Nature 339 (1989) 234–236.
[11] R.Y. Wang, K.C. Kuo, C.W. Gehrke, L.H. Huang, M. Ehrlich, Heat- and alkali-induced deamination of 5-methylcytosine and cytosine residues in DNA, Biochim. Biophys. Acta 697 (1982) 371–377.
[12] P.A. Jones, W.M. Rideout 3rd, J.C. Shen, C.H. Spruck, Y.C. Tsai, Methylation, mutation and cancer, Bioessays 14 (1992) 33–36.
[13] P. Neddermann, J. Jiricny, The purification of a mismatch-specific thymine-DNA glycosylase from HeLa cells, J. Biol. Chem. 268 (1993) 21218–21224.
[14] P. Neddermann, P. Gallinari, T. Lettieri, D. Schmid, O. Truong, J.J. Hsuan, K. Wiebauer, J. Jiricny, Cloning and expression of human G/T mismatch-specific thymine-DNA glycosylase, J. Biol. Chem. 271 (1996) 12767–12774.
[15] B. Hendrich, U. Hardeland, H.H. Ng, J. Jiricny, A. Bird, The thymine glycosylase MBD4 can bind to the product of deamination at methylated CpG sites, Nature 401 (1999) 301–304.
[16] A. Bellacosa, L. Cicchillitti, F. Schepis, A. Riccio, A.T. Yeung, Y. Matsumoto, E.A. Golemis, M. Genuardi, G. Neri, MED1, a novel human methyl-CpG-binding endonuclease, interacts with DNA mismatch repair protein MLH1, Proc. Natl. Acad. Sci. U. S. A. 96 (1999) 3969–3974.
[17] F. Petronzelli, A. Riccio, G.D. Markham, S.H. Seeholzer, M. Genuardi, M. Karbowski, A.T. Yeung, Y. Matsumoto, A. Bellacosa, Investigation of the substrate spectrum of the human mismatch-specific DNA N-glycosylase MED1 (MBD4): fundamental role of the catalytic domain, J. Cell. Physiol. 185 (2000) 473–480.
[18] F. Petronzelli, A. Riccio, G.D. Markham, S.H. Seeholzer, J. Stoerker, M. Genuardi, A.T. Yeung, Y. Matsumoto, A. Bellacosa, Biphasic kinetics of the human DNA repair protein MED1 (MBD4), a mismatch-specific DNA N-glycosylase, J. Biol. Chem. 275 (2000) 32422–32429.
[19] U. Hardeland, M. Bentele, J. Jiricny, P. Schar, The versatile thymine DNA-glycosylase: a comparative characterization of the human, Drosophila and fission yeast orthologs, Nucleic Acids Res. 31 (2003) 2261–2271.
[20] D.P. Turner, S. Cortellino, J.E. Schupp, E. Caretti, T. Loh, T.J. Kinsella, A. Bellacosa, The DNA N-glycosylase MED1 exhibits preference for halogenated pyrimidines and is involved in the cytotoxicity of 5-iododeoxyuridine, Cancer Res. 66 (2006) 7686–7693.
[21] B. Hendrich, A. Bird, Identification and characterization of a family of mammalian methyl-CpG binding proteins, Mol. Cell. Biol. 18 (1998) 6538–6547.
[22] H.M. Nash, S.D. Bruner, O.D. Scharer, T. Kawate, T.A. Addona, E. Spooner, W.S. Lane, G.L. Verdine, Cloning of a yeast 8-oxoguanine DNA glycosylase reveals the existence of a base-excision DNA-repair protein superfamily, Curr. Biol. 6 (1996) 968–980.
[23] V. Valinluck, P. Liu, J.I. Kang Jr., A. Burdzy, L.C. Sowers, 5-Halogenated pyrimidine lesions within a CpG sequence context mimic 5-methylcytosine by enhancing the binding of the methyl-CpG-binding domain of methyl-CpG-binding protein 2 (MeCP2), Nucleic Acids Res. 33 (2005) 3057–3064.
[24] B. Zhu, Y. Zheng, H. Angliker, S. Schwarz, S. Thiry, M. Siegmann, J.P. Jost, 5-Methylcytosine DNA glycosylase activity is also present in the human MBD4 (G/T mismatch glycosylase) and in a related avian sequence, Nucleic Acids Res. 28 (2000) 4157–4165.
[25] H. Hashimoto, Y. Liu, A.K. Upadhyay, Y. Chang, S.B. Howerton, P.M. Vertino, X. Zhang, X. Cheng, Recognition and potential mechanisms for replication and erasure of cytosine hydroxymethylation, Nucleic Acids Res. 40 (2012) 4841–4849.
[26] H. Hashimoto, X. Zhang, X. Cheng, Excision of thymine and 5-hydroxymethyluracil by the MBD4 DNA glycosylase domain: structural basis and implications for active DNA demethylation, Nucleic Acids Res. 40 (2012) 8276–8284.
[27] S. Morera, I. Grin, A. Vigouroux, S. Couve, V. Henriot, M. Saparbaev, A.A. Ishchenko, Biochemical and structural characterization of the glycosylase domain of MBD4 bound to thymine and 5-hydroxymethyuracil-containing DNA, Nucleic Acids Res. 40 (2012) 9917–9926.
[28] J.U. Guo, Y. Su, C. Zhong, G.L. Ming, H. Song, Hydroxylation of 5-methylcytosine by TET1 promotes active DNA demethylation in the adult brain, Cell 145 (2011) 423–434.
[29] C.B. Millar, J. Guy, O.J. Sansom, J. Selfridge, E. MacDougall, B. Hendrich, P.D. Keightley, S.M. Bishop, A.R. Clarke, A. Bird, Enhanced CpG mutability and tumorigenesis in MBD4-deficient mice, Science 297 (2002) 403–405.
[30] E. Wong, K. Yang, M. Kuraguchi, U. Werling, E. Avdievich, K. Fan, M. Fazzari, B. Jin, A.M. Brown, M. Lipkin, W. Edelmann, Mbd4 inactivation increases C → T transition mutations and promotes gastrointestinal tumor formation, Proc. Natl. Acad. Sci. U. S. A. 99 (2002) 14937–14942.
[31] S. Bader, M. Walker, D. Harrison, Most microsatellite unstable sporadic colorectal carcinomas carry MBD4 mutations, Br. J. Cancer 83 (2000) 1646–1649.
[32] A. Riccio, L.A. Aaltonen, A.K. Godwin, A. Loukola, A. Percesepe, R. Salovaara, V. Masciullo, M. Genuardi, M. Paravatou-Petsotas, D.E. Bassi, B.A. Ruggeri, A.J. Klein-Szanto, J.R. Testa, G. Neri, A. Bellacosa, The DNA repair gene MBD4 (MED1) is mutated in human carcinomas with microsatellite instability, Nat. Genet. 23 (1999) 266–268.
[33] S. Cortellino, D. Turner, V. Masciullo, F. Schepis, D. Albino, R. Daniel, A.M. Skalka, N.J. Meropol, C. Alberti, L. Larue, A. Bellacosa, The base excision repair enzyme MED1 mediates DNA damage response to antitumor drugs and is associated with mismatch repair system integrity, Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 15071–15076.
[34] L.M. Iyer, S. Abhiman, L. Aravind, Natural history of eukaryotic DNA methylation systems, Prog. Mol. Biol. Transl. Sci. 101 (2011) 25–104.
[35] A. Zemach, G. Grafi, Characterization of Arabidopsis thaliana methyl-CpG-binding domain (MBD) proteins, Plant J. 34 (2003) 565–572.
[36] A. Zemach, G. Grafi, Methyl-CpG-binding domain proteins in plants: interpreters of DNA methylation, Trends Plant Sci. 12 (2007) 80–85.
[37] S.F. Altschul, T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D.J. Lipman, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25 (1997) 3389–3402.
[38] S. Moretti, F. Armougom, I.M. Wallace, D.G. Higgins, C.V. Jongeneel, C. Notredame, The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods, Nucleic Acids Res. 35 (2007) W645–W648.
[39] E.M. Zdobnov, R. Apweiler, InterProScan – an integration platform for the signature-recognition methods in InterPro, Bioinformatics 17 (2001) 847–848.
[40] J.J. Ward, J.S. Sodhi, L.J. McGuffin, B.F. Buxton, D.T. Jones, Prediction and functional analysis of native disorder in proteins from the three kingdoms of life, J. Mol. Biol. 337 (2004) 635–645.
[41] T. Schwede, J. Kopp, N. Guex, M.C. Peitsch, SWISS-MODEL: an automated protein homology-modeling server, Nucleic Acids Res. 31 (2003) 3381–3385.
[42] B.A. Manvilla, A. Maiti, M.C. Begley, E.A. Toth, A.C. Drohat, Crystal structure of human methyl-binding domain IV glycosylase bound to abasic DNA, J. Mol. Biol. 420 (2012) 164–175.
[43] S.E. Bennett, C.Y. Chen, D.W. Mosbaugh, Escherichia coli nucleoside diphosphate kinase does not act as a uracil-processing DNA repair nuclease, Proc. Natl. Acad. Sci. U. S. A. 101 (2004) 6391–6396.
[44] U. Hardeland, M. Bentele, J. Jiricny, P. Schar, Separating substrate recognition from base hydrolysis in human thymine DNA glycosylase by mutational analysis, J. Biol. Chem. 275 (2000) 33449–33456.
[45] M. Seki, M. Narusaka, A. Kamiya, J. Ishida, M. Satou, T. Sakurai, M. Nakajima, A. Enju, K. Akiyama, Y. Oono, M. Muramatsu, Y. Hayashizaki, J. Kawai, P. Carninci, M. Itoh, Y. Ishii, T. Arakawa, K. Shibata, A. Shinagawa, K. Shinozaki, Functional annotation of a full-length Arabidopsis cDNA collection, Science 296 (2002) 141–145.
[46] K. Yamada, J. Lim, J.M. Dale, H. Chen, P. Shinn, C.J. Palm, A.M. Southwick, H.C. Wu, C. Kim, M. Nguyen, P. Pham, R. Cheuk, G. Karlin-Newmann, S.X. Liu, B. Lam, H. Sakano, T. Wu, G. Yu, M. Miranda, H.L. Quach, M. Tripp, C.H. Chang, J.M. Lee, M. Toriumi, M.M. Chan, C.C. Tang, C.S. Onodera, J.M. Deng, K. Akiyama, Y. Ansari, T. Arakawa, J. Banh, F. Banno, L. Bowser, S. Brooks, P. Carninci, Q. Chao, N. Choy, A. Enju, A.D. Goldsmith, M. Gurjal, N.F. Hansen, Y. Hayashizaki, C. Johnson-Hopson, V.W. Hsuan, K. Iida, M. Karnes, S. Khan, E. Koesema, J. Ishida, P.X. Jiang, T. Jones, J. Kawai, A. Kamiya, C. Meyers, M. Nakajima, M. Narusaka, M. Seki, T. Sakurai, M. Satou, R. Tamse, M. Vaysberg, E.K. Wallender, C. Wong, Y. Yamamura, S. Yuan, K. Shinozaki, R.W. Davis, A. Theologis, J.R. Ecker, Empirical analysis of transcriptional activity in the Arabidopsis genome, Science 302 (2003) 842–846.
[47] R. Lister, R.C. O’Malley, J. Tonti-Filippini, B.D. Gregory, C.C. Berry, A.H. Millar, J.R. Ecker, Highly integrated single-base resolution maps of the epigenome in Arabidopsis, Cell 133 (2008) 523–536.
[48] S.J. Cokus, S. Feng, X. Zhang, Z. Chen, B. Merriman, C.D. Haudenschild, S. Pradhan, S.F. Nelson, M. Pellegrini, S.E. Jacobsen, Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning, Nature 452 (2008) 215–219.
[49] B. Hendrich, S. Tweedie, The methyl-CpG binding domain and the evolving role of DNA methylation in animals, Trends Genet. 19 (2003) 269–277.
[50] I.A. Manke, D.M. Lowery, A. Nguyen, M.B. Yaffe, BRCT repeats as phosphopeptide-binding modules involved in protein targeting, Science 302 (2003) 636–639.
[51] M.T. Bennett, M.T. Rodgers, A.S. Hebert, L.E. Ruslander, L. Eisele, A.C. Drohat, Specificity of human thymine DNA glycosylase depends on N-glycosidic bond stability, J. Am. Chem. Soc. 128 (2006) 12510–12519.
[52] M.T. Morgan, M.T. Bennett, A.C. Drohat, Excision of 5-halogenated uracils by human thymine DNA glycosylase. Robust activity for DNA contexts other than CpG, J. Biol. Chem. 282 (2007) 27578–27586.
[53] D. Córdoba-Cañero, E. Dubois, R.R. Ariza, M.P. Doutriaux, T. Roldan-Arjona, Arabidopsis uracil DNA glycosylase (UNG) is required for base excision repair of uracil and increases plant sensitivity to 5-fluorouracil, J. Biol. Chem. 285 (2010) 7475–7483.
[54] T. Morales-Ruiz, A.P. Ortega-Galisteo, M.I. Ponferrada-Marin, M.I. Martinez-Macias, R.R. Ariza, T. Roldan-Arjona, DEMETER and REPRESSOR OF SILENCING 1 encode 5-methylcytosine DNA glycosylases, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 6853–6858.
[55] A.P. Ortega-Galisteo, T. Morales-Ruiz, R.R. Ariza, T. Roldan-Arjona, Arabidopsis DEMETER-LIKE proteins DML2 and DML3 are required for appropriate distribution of DNA methylation marks, Plant Mol. Biol. 67 (2008) 671–681.
[56] Z. Gong, T. Morales-Ruiz, R.R. Ariza, T. Roldan-Arjona, L. David, J.K. Zhu, ROS1, a repressor of transcriptional gene silencing in Arabidopsis, encodes a DNA glycosylase/lyase, Cell 111 (2002) 803–814.
[57] M. Gehring, J.H. Huh, T.F. Hsieh, J. Penterman, Y. Choi, J.J. Harada, R.B. Goldberg, R.L. Fischer, DEMETER DNA glycosylase establishes MEDEA polycomb gene self-imprinting by allele-specific demethylation, Cell 124 (2006) 495–506.
[58] J. Penterman, D. Zilberman, J.H. Huh, T. Ballinger, S. Henikoff, R.L. Fischer, DNA demethylation in the Arabidopsis genome, Proc. Natl. Acad. Sci. U. S. A. 104 (2007) 6752–6757.