The First Structure of an Active Mammalian dCTPase and its Complexes With Substrate Analogs and Products

Journal Pre-proof

Emma Scaletti, Magnus Claesson, Thomas Helleday, Ann-Sofie Jemth, Pål Stenmark

PII: S0022-2836(20)30024-3

DOI: https://doi.org/10.1016/j.jmb.2020.01.005

Reference: YJMBI 66389

To appear in: Journal of Molecular Biology

Received Date: 9 September 2019

Revised Date: 30 December 2019

Accepted Date: 3 January 2020

Please cite this article as: E. Scaletti, M. Claesson, T. Helleday, A.-S. Jemth, P. Stenmark, The first structure of an active mammalian dCTPase and its complexes with substrate analogues and products, Journal of Molecular Biology, https://doi.org/10.1016/j.jmb.2020.01.005. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2020 Elsevier Ltd. All rights reserved.

CREDIT AUTHOR STATEMENT Emma Scaletti: Investigation, Visualization, Writing- Original draft preparation. Magnus Claesson: Investigation. Ann-Sofie Jemth: Investigation, Visualization, Writing- Review and Editing. Pål Stenmark: Supervision, Conceptualization, Funding acquisition. Thomas Helleday: Supervision, Conceptualization, Funding acquisition.

The first structure of an active mammalian dCTPase and its complexes with substrate analogues and products

Emma Scaletti1,2*, Magnus Claesson2*, Thomas Helleday3,4, Ann-Sofie Jemth3, and Pål Stenmark1,2

1 Department of Experimental Medical Science, Lund University, Lund 221 00, Sweden.

2 Department of Biochemistry and Biophysics, Stockholm University, S-106 91, Stockholm, Sweden.

3 Science for Life Laboratory, Department of Oncology-Pathology, Karolinska Institutet, S171 76 Stockholm, Sweden.

4 Sheffield Cancer Centre, Department of Oncology and Metabolism, University of Sheffield, Sheffield, S10 2RX Sheffield, UK.

Correspondence should be addressed to Pål Stenmark, e-mail: [email protected] Tel: +468163729 or Ann-Sofie Jemth, e-mail: [email protected], Tel: +46768292927.

* These authors contributed equally to this work

ABSTRACT

Precise regulation of dNTPs within the cellular nucleotide pool is essential for high accuracy of DNA replication and is critical for retaining genomic integrity. Recently, human dCTPase (deoxycytidine triphosphatase), also known as DCTPP1 (human all-alpha dCTP pyrophosphatase 1), has been revealed to be a key player in the balance of pyrimidine nucleotide concentrations within cells, with DCTPP1 deficiency causing DNA damage and genetic instability in both chromosomal and mitochondrial DNA. DCTPP1 also exhibits an additional ‘house cleaning’ function, as it has been shown to be highly active against modified cytidine triphosphates such as 5-methyl-dCTP, which if incorrectly incorporated into DNA can introduce undesirable epigenetic marking. To date, structural studies of mammalian dCTPase have been limited to inactive constructs, which do not provide information regarding the catalytic mechanism of this important enzyme. We here present the first structures of an active mammalian dCTPase from M. musculus in complex with the non-hydrolysable substrate analogue dCMPNPP and the products 5-Me-dCMP and dCMP. These structures provide clear insights into substrate binding and catalysis, and explain why previous structures of mammalian dCTPase were catalytically inactive. The overall structure of M. musculus dCTPase is highly similar to enzymes from the all-alpha NTP phosphohydrolase superfamily. Comparison of M. musculus dCTPase with homologues from a diverse range of mammals, including humans, shows that the residues which contribute to substrate recognition are entirely conserved, further supporting the importance of this enzyme in the protection of genomic integrity in mammalian cells.

HIGHLIGHTS

• The first structures of an active mammalian dCTPase.

• The structure of M. musculus dCTPase in complex with dCMPNPP provides clear insights into substrate binding and catalysis.

• The overall M. musculus dCTPase structure is highly similar to proteins from the all-alpha phosphohydrolase superfamily.

• Comparison of dCTPase sequences from a diverse range of mammals indicates that the residues which contribute to substrate recognition are entirely conserved.

KEYWORDS: dCTPase, DCTPP1, 5-Me-dCMP, nucleotide pool regulation, epigenetics

ABBREVIATIONS: dCTPase, deoxycytidine triphosphatase; DCTPP1, human all-alpha dCTP pyrophosphatase 1; dCMPNPP, 2'-deoxycytidine-5'-[(α,β)-imido]triphosphate; dNTPs, deoxynucleotide triphosphates.

INTRODUCTION

The preservation of genomic integrity of both nuclear and mitochondrial genomes requires tight regulation of the concentrations of deoxynucleotide triphosphates (dNTPs) within cellular nucleotide pools [1, 2], with imbalances introducing errors in DNA replication, chromosomal abnormalities and cell death [2-4]. Key enzymes in nucleotide pool homeostasis include ribonucleotide reductase and SAMHD1 [5, 6]. Recently, the enzyme human dCTPase, also known as DCTPP1 or XTP3-transactivated protein A, was shown to play an important role in maintaining the balance of pyrimidine nucleotides (dCTP, dUTP and dTTP) within cells [7]. Mammalian dCTPase belongs to the all-alpha NTP pyrophosphohydrolase superfamily, which also includes dimeric dUTPases that hydrolyze dUTP and dUDP [8-12]; MazG, which hydrolyzes several deoxy- and ribonucleotides including ATP, GTP, CTP, UTP, dATP, dCTP and dTTP [13, 14]; phage T4 dCTPase, which hydrolyzes dUTP, dUDP, dCTP and dCDP [15-17]; and phosphoribosyl-ATP pyrophosphatase HisE, which is involved in histidine biosynthesis [18, 19]. DCTPP1 is dependent on magnesium ions for catalysis and preferentially hydrolyses dCTP to inorganic pyrophosphate and dCMP, the latter of which is required for thymidylate synthesis [20, 21]. dTTP, dATP and dUTP are much poorer substrates than dCTP, and DCTPP1 has no activity towards dGTP, dCDP, UTP or ATP. Furthermore, the enzyme has very poor activity towards CTP [20]. The depletion of DCTPP1 has been shown to result in a larger pool of dCTP, the consequence of which is reduced levels of dTTP and a severe alteration of the cellular dUTP/dTTP ratio. DCTPP1-deficient cells are thus more sensitive to uracil misincorporation and exhibit an activated DNA damage response, alterations in cell cycle progression and a mutator phenotype that negatively impacts both chromosomal and mitochondrial DNA [7].

In addition to the important role played by DCTPP1 in maintaining the balance of pyrimidine nucleotide pools, the enzyme also performs an important ‘house-cleaning’ function in the cell. The free nucleotide pool is highly susceptible to unwanted chemical modifications produced as by-products of cellular metabolism [22, 23]. Interestingly, DCTPP1 displays high activity towards dCTP modified in the 5 position, such as 5-methyl-dCTP, 5-I-dCTP and 5-Br-dCTP [20, 21, 24]. Hydrolysis of methylated dCTP to its monophosphate form by DCTPP1 prevents its incorporation into DNA, thereby avoiding mutations and/or changes in DNA methylation pattern [20, 25]. In addition to dCTPase, the removal of potentially mutagenic nucleotides from the nucleotide pool in humans is achieved through the concerted effort of a number of other enzymes [26, 27]: the Nudix pyrophosphohydrolase MTH1, which is highly active against 8-oxo-dGTP, 2-OH-dATP and O6-methyl-dGTP [28, 29]; all-beta dUTPases, which prevent the accumulation of dUTP and thus help to maintain DNA replication fidelity [30]; ITPases, which hydrolyze the non-canonical nucleotides dITP and dXTP that are produced from defects in purine synthesis and deamination reactions [31-33]; and the non-NTP pyrophosphatase SAMHD1, which hydrolyzes canonical and non-canonical nucleotide substrates such as O6-methyl-dGTP, 5-methyl-dCTP and 2-thio-dTTP to their nucleotide base [34, 35]. DCTPP1 has been shown to be over-expressed in highly proliferative tissues with an expanded nucleotide pool and, unsurprisingly, is also over-expressed in a variety of human cancers and stem cells [25, 36, 37], where there is a pronounced need for DNA precursors and nucleotide pool homeostasis. Recently, DCTPP1 has also been identified as a novel positive regulator of the Wnt signalling pathway, the hyperactivation of which is associated with carcinogenesis [38].

Previous structural studies of mammalian dCTPase have involved inactive constructs, which unfortunately do not allow insights into the catalytic mechanism of this highly relevant enzyme. These structures were produced from a truncated (aa 21-126) and catalytically inactive form of M. musculus dCTPase (RSCUT) and include an apo structure and a complex with 5Me-dCTP, both of which were solved in the absence of magnesium ions [24]. Here we present the first structures of an active mammalian dCTPase from M. musculus in complex with the substrate analogue dCMPNPP, and the products dCMP and 5-Me-dCMP. Our structures, solved in complex with the catalytically essential magnesium ions, provide clear insights into substrate binding and catalysis. Furthermore, it is evident from our structure in complex with dCMPNPP why previous structures of the enzyme were catalytically inactive. The overall M. musculus dCTPase structure is highly similar to proteins from the all-alpha NTP phosphohydrolase superfamily. Sequence comparisons of mouse dCTPase with the sequences from a wide variety of mammals, including humans, confirm that the residues which contribute to substrate recognition are entirely conserved, highlighting the importance of this enzyme in the protection of genomic integrity in mammalian cells.

RESULTS

Overall Structure of M. musculus dCTPase

The structures of Mm_dCTPase in complex with the substrate analogue dCMPNPP and the products dCMP and 5Me-dCMP were solved in the space group P4₁2₁2 to 1.90, 1.90 and 1.80 Å resolution, respectively. Data collection and refinement statistics are presented in Table 1. The quaternary structure of Mm_dCTPase is a D2 symmetric tetramer, which is composed of two C2 symmetric dimers (Figure 1). The individual Mm_dCTPase monomer has an all α-helical structure and consists of four α-helices (α-1 to α-4) and two 3₁₀-helices (η1 and η2) located between α-1 and α-2 and between α-2 and α-3, respectively (Figure 1A). There are two identical active sites per dimer, each of which is composed of residues from both monomers (Figure 1B). The Mm_dCTPase tetramer is formed by two homodimers interacting through their respective α-helix 2 and its associated loop regions, which assemble the central helix bundle (Figure 1C). This oligomeric state is consistent with previous reports from ultracentrifugation studies, which have shown the enzyme to be a tetramer in solution [24]. Analysis of the Mm_dCTPase structure with PISA (Protein Interfaces, Surfaces and Assemblies) from the EBI Web server [39] shows an average surface area of 9098 Å² for the individual monomer and a buried interface area of 2050 Å². There are 74 residues from each monomer that contribute to the dimer interface. Analysis of the dimer-dimer interface using PISA shows a buried interface area of 1089 Å². The dimer-dimer interface is symmetrical and is formed by the same 50 residues from both dimers.

dCMPNPP Substrate Recognition by M. musculus dCTPase

The Mm_dCTPase structure in complex with the substrate analogue dCMPNPP has clear electron density for the ligand (Figure 2A). This modified nucleotide is a non-hydrolysable form of dCTP, in which the α- and β-phosphates are linked by a nitrogen atom, rather than by the oxygen atom present in the canonical substrate. Overall, the dimer formed by monomers C and D has slightly stronger electron density than the dimer formed by monomers A and B. Descriptions regarding ligand binding in this manuscript therefore refer exclusively to monomers C and D. There are two dCMPNPP ligands per dimer, each of which coordinates two catalytically essential magnesium ions (Figure 1B). In each active site, the two magnesium ions (M1 and M2) exhibit ideal octahedral coordination involving the dCMPNPP α- and β-phosphate groups, protein side chains, and ordered water molecules. In particular, M1 is coordinated by the side chains of Glu63(D), Glu66(D), Glu95(D) and Asp98(D), one water molecule (w13) and an oxygen atom from the α-phosphate of the nucleotide. The second magnesium ion, M2, is coordinated by the side chains of Glu63(D) and Glu66(D), one water molecule (w13) and an oxygen atom from each of the α- and β-phosphates of dCMPNPP (Figure 2B). The triphosphate moiety is further positioned by interactions with residues that all come from the second monomer of the dimer. These are hydrogen bond interactions between the α-phosphate and Tyr129(C), the β-phosphate and Lys121(C), and the γ-phosphate and Arg128(C) (Figure 2B). The deoxyribose moiety of dCMPNPP is positioned by hydrogen bonding between the O3’ atom and the side chains of Asp98(D) and Asn125(C), in addition to a C−H⋅⋅⋅π interaction with the side chain of Tyr102(D) (Figure 2A). The cytosine base of the nucleotide makes two hydrogen bond interactions, with His38(D) and His51(D), and is supported by a number of hydrophobic interactions involving Ile101(D), Trp47(D) and Trp73(A) (Figure 2A). Each active site is therefore composed of residues from a single dimer plus a single residue (Trp73) from the neighbouring dimer of the Mm_dCTPase tetramer. It has previously been shown that tetramer formation is essential for substrate binding and activity of Mm_dCTPase [24], which is in agreement with our structural data.

Kinetics of M. musculus dCTPase with dCTP and 5Me-dCTP

The Mm_dCTPase construct from which the structures were determined in this work contained both N- and C-terminal truncations (aa 22-134). To verify that this construct was in fact active, the enzyme was assayed against various concentrations of dCTP and 5Me-dCTP, following which kinetic constants were determined (Figure 3). The KM value of Mm_dCTPase (22-134) was significantly lower for 5Me-dCTP (156 µM) than for dCTP (411 µM), indicating that the modified cytosine nucleotide binds the enzyme with higher affinity. This contrasts with previous reports by Nonaka et al. on the full-length enzyme (RS21-C6), which showed very little difference between the KM values for 5Me-dCTP and dCTP (48.5 and 44 µM, respectively) [21]. Our KM value of 411 µM for dCTP is more similar to that from a study performed by Wu et al., who reported a KM value of 270 µM for RS21-C6 [24]. The KM values reported here for Mm_dCTPase (22-134) suggest that the N- and C-terminal truncations may affect the binding of unmodified dCTP more severely than that of 5-methyl-dCTP. Truncated Mm_dCTPase (22-134) was shown to have a similar turnover of 5-methyl-dCTP and dCTP, as evidenced by their associated kcat values of 0.04 and 0.032 s⁻¹, respectively. The kcat/KM values, representing the catalytic efficiency of a reaction, indicated a 3.5-fold higher catalytic efficiency for 5Me-dCTP hydrolysis compared to dCTP (Figure 3B), showing a preference for 5Me-dCTP over dCTP as has been shown for human dCTPase [25]. Thus, the effect of the truncations is larger on the KM values than on the kcat values, indicating that substrate binding is more affected by the truncations than the actual hydrolysis reaction. Comparison of the kcat/KM values in the study by Nonaka et al. indicated a 1.1-fold higher catalytic efficiency for dCTP compared to 5Me-dCTP, indicating that the full-length enzyme displays essentially no preference between the two substrates [21]. The turnover of substrate is much lower than what has been previously shown for RS21-C6 by Nonaka et al., who reported kcat values for dCTP and 5Me-dCTP of 1.37 and 1.33 s⁻¹, respectively [21]. The lower activity of Mm_dCTPase_FL (the full-length mouse enzyme) presented here compared to the values presented by Nonaka et al. is likely to be due, at least in part, to differences in assay temperature (we used 22 °C while Nonaka et al. used 30 °C), and may also be due to our kcat values being calculated per monomer and not per tetramer.
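The catalytic efficiencies discussed above follow directly from the quoted constants. As a minimal sketch, the snippet below recomputes kcat/KM from the rounded kcat and KM values given in the text; the function and variable names are illustrative assumptions and are not part of the study. With these rounded inputs, the 5Me-dCTP/dCTP ratio for the truncated enzyme comes out near 3.3 rather than the reported 3.5, which would be expected if the reported figure was derived from unrounded fitted constants, while the ratio for the Nonaka et al. values comes out at about 1.1.

```python
def catalytic_efficiency(kcat_per_s, km_micromolar):
    """Return kcat/KM in M^-1 s^-1 from kcat (s^-1) and KM (µM)."""
    return kcat_per_s / (km_micromolar * 1e-6)


# Rounded constants quoted in the text for Mm_dCTPase (22-134).
trunc_dctp = catalytic_efficiency(0.032, 411.0)
trunc_5me = catalytic_efficiency(0.04, 156.0)
trunc_ratio = trunc_5me / trunc_dctp  # 5Me-dCTP relative to dCTP

# Constants quoted in the text from Nonaka et al. for full-length RS21-C6.
nonaka_dctp = catalytic_efficiency(1.37, 44.0)
nonaka_5me = catalytic_efficiency(1.33, 48.5)
nonaka_ratio = nonaka_dctp / nonaka_5me  # dCTP relative to 5Me-dCTP

print(f"truncated enzyme, 5Me-dCTP/dCTP efficiency ratio: {trunc_ratio:.2f}")
print(f"Nonaka et al., dCTP/5Me-dCTP efficiency ratio: {nonaka_ratio:.2f}")
```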

In order to determine the effect that the regions removed from our Mm_dCTPase (22-134) construct had on enzyme activity, we purified full-length mouse dCTPase (Mm_dCTPase_FL) and performed the same kinetic analysis (Figure 3). Mm_dCTPase (22-134) displayed a 4-fold lower kcat value for dCTP and a 2-fold lower kcat for 5-methyl-dCTP, compared to Mm_dCTPase_FL (Figure 3B). The KM value of truncated Mm_dCTPase (22-134) is 6.7-fold higher for dCTP and 7.3-fold higher for 5-methyl-dCTP compared to Mm_dCTPase_FL. The kcat/KM value is 26 times higher for full-length Mm_dCTPase towards dCTP and 16 times higher towards 5-methyl-dCTP compared to Mm_dCTPase (22-134) (Figure 3B). This indicates that the regions removed in the truncated enzyme play a significant role in substrate binding and turnover. Overall, it is evident that the truncations in the Mm_dCTPase (22-134) construct have a significant negative effect on enzymatic activity and substrate binding. These truncated regions may be important for effective tetramerization and/or product release in mammalian dCTPase.
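Because catalytic efficiency is kcat/KM, the fold loss in efficiency on truncation should be roughly the product of the fold decrease in kcat and the fold increase in KM. The short check below applies this to the rounded fold changes quoted above; the function name is an illustrative assumption. The products, about 27 and 15, are close to the reported 26-fold and 16-fold differences, and the gap plausibly reflects rounding of the individual fold values.

```python
def efficiency_fold_loss(kcat_fold_lower, km_fold_higher):
    """Fold decrease in kcat/KM implied by separate kcat and KM fold changes."""
    return kcat_fold_lower * km_fold_higher


# Rounded fold changes quoted in the text (truncated vs full-length enzyme).
dctp_fold = efficiency_fold_loss(4.0, 6.7)
methyl_dctp_fold = efficiency_fold_loss(2.0, 7.3)

print(f"dCTP: about {dctp_fold:.1f}-fold lower kcat/KM")
print(f"5-methyl-dCTP: about {methyl_dctp_fold:.1f}-fold lower kcat/KM")
```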

dCMP and 5Me-dCMP Product Recognition by M. musculus dCTPase

The structure of Mm_dCTPase was also solved in complex with the reaction products dCMP and 5Me-dCMP. Both structures contain clear electron density for their respective ligands within the active site binding pocket (Figure 4). Each structure contains two nucleotides per dimer. In contrast to Mm_dCTPase-dCMPNPP, in which each ligand coordinates two magnesium ions, the nucleotides in the dCMP and 5Me-dCMP complexes each coordinate only a single magnesium ion (M1). In the Mm_dCTPase-dCMP active site, M1 exhibits ideal octahedral coordination involving the residues Glu63(D), Glu66(D), Glu95(D) and Asp98(D) and two water molecules (Figure 4A). The phosphate group of dCMP is stabilized by hydrogen bond interactions with Arg128(C) and Tyr129(C), in addition to six water molecules. The deoxyribose moiety of dCMP is positioned by hydrogen bond interactions between O3’ and the residues Asp98(D) and Asn125(C), as well as a C−H⋅⋅⋅π interaction with Tyr102(D) (Figure 4A). The cytosine base of dCMP hydrogen bonds with the residues His38(D) and His51(D), and is surrounded by hydrophobic interactions involving Trp47(D), Ile101(D), and Trp73(A).

In the Mm_dCTPase-5Me-dCMP structure, the single active site magnesium ion displays ideal octahedral coordination with Glu63(D), Glu66(D), Glu95(D) and Asp98(D) and two water molecules (Figure 4B), identical to what was observed for the dCMP-bound enzyme. The hydrogen bond and hydrophobic interactions involved in positioning the deoxyribose moiety and base of 5Me-dCMP are also the same as those observed in the Mm_dCTPase-dCMP structure (Figure 4B). Cα-atom superposition of the dCMP- and 5Me-dCMP-bound dCTPase shows that the two nucleotides overlay perfectly within the active site (Figure 4C). The majority of residues which coordinate the ligand, including those which interact with the catalytic magnesium ion, also superimpose very well. There is a slight movement of Tyr129(C), which interacts with the phosphate group of both ligands, and of Trp73(A), which in the 5Me-dCMP-bound structure is positioned slightly closer to the base than in the dCMP complex (Figure 4C). Comparison of Mm_dCTPase-5Me-dCMP with the dCMPNPP-bound enzyme also shows the nucleotides and surrounding residues to superimpose well. Tyr129(C) and Trp73(A) display slight differences in their positioning, as is also observed when comparing the 5Me-dCMP and dCMP complexes (Figure 4D). The residue Arg128(C), on the other hand, occupies completely different positions in the two structures. In the dCMPNPP-bound enzyme, Arg128(C) interacts with two oxygens from the γ-phosphate of the nucleotide. In the product complex, the residue flips position to hydrogen bond with a single oxygen from the monophosphate group, which is equivalent to the α-phosphate of dCMPNPP (Figure 4D).

Structural comparisons of active and inactive M. musculus dCTPases

Prior to this study, the only existing structural information concerning mammalian dCTPases involved a truncated (aa 21-126) inactive form of the mouse enzyme, referred to as ‘RSCUT’ [24]. These structures were solved in the presence and absence of 5Me-dCTP, both in the absence of the catalytically required magnesium ions. In order to explain the lack of activity associated with previous structural studies of the enzyme, we compared our active dCMPNPP-bound Mm_dCTPase structure with apo (PDB id: 2oie) and 5Me-dCTP-bound (PDB id: 2oig) RSCUT (Figure 5). Superposition of active Mm_dCTPase-dCMPNPP with the inactive apo RSCUT indicated that no major conformational changes are induced in the active site pocket upon substrate binding (Figure 5A). Comparison of active Mm_dCTPase-dCMPNPP with 5Me-dCTP-bound RSCUT also showed the active site residues to superimpose well; however, there were extreme differences in the positions of the bound substrates (Figure 5B). In the case of RSCUT, 5Me-dCTP is positioned significantly further into the binding pocket, so that the residues His38 and His51 are much closer to the base (distances of 2.38 and 1.65 Å, respectively) than they are in the active dCMPNPP-bound structure (distances of 2.9 and 3.0 Å, respectively). The phosphate groups of the substrates occupy dramatically different positions (Figure 5B). The residues Glu63, Glu66, Glu95 and Asp98, shown to coordinate two magnesium ions in the structure of the active enzyme in complex with dCMPNPP (Figure 2B), are also observed in both RSCUT structures. The RSCUT construct should therefore be capable of binding metal ions that aid in the coordination of the triphosphate moiety. However, the residues Arg128 and Tyr129, which help to position the triphosphate group in the active dCMPNPP-bound enzyme, do not exist in the RSCUT construct (Figure 5). In addition, residue Asn125 (which interacts with the deoxyribose moiety of dCMPNPP in the active enzyme) was not observed in the RSCUT structures, likely owing to its position at the extreme C-terminal end of the protein (Figure 5B). Evidently, the residues Arg128 and Tyr129, in combination with the active site magnesium ions, are critical for the correct positioning of the triphosphate group for catalysis.

Overall structural comparisons of Mm_dCTPase with related enzymes from the all-α NTP pyrophosphatase superfamily

A structural similarity search was performed using the DALI web server [40], which indicated that Mm_dCTPase was most structurally similar to other enzymes from the all-alpha helical superfamily, including dimeric dUTPases, which hydrolyze dUTP and dUDP [9, 41-44], and MazG, which hydrolyzes several deoxy- and ribonucleotides including ATP, GTP, CTP, UTP, dATP, dCTP and dTTP [45-47]. We compared our Mm_dCTPase structure to dimeric dUTPases from T. cruzi, L. major and C. jejuni, and MazG/MazG-like proteins from E. coli, S. solfataricus and D. radiodurans. Interestingly, while Mm_dCTPase displays low overall sequence identity with these enzymes, the EXXE_EXXD motif (Glu63, Glu66, Glu95 and Asp98, Mm_dCTPase numbering) is entirely conserved (Figure 6A). These highly conserved residues are responsible for the coordination of metal ions required for substrate hydrolysis. The core helices of the individual Mm_dCTPase monomer superimpose well with the core structures of L. major, T. cruzi and C. jejuni dUTPase (Figure 6B), S. solfataricus MazG and D. radiodurans MazG-like protein (Figure 6C) and the individual N- and C-terminal domains of E. coli MazG (Figure 6D). Furthermore, the aforementioned glutamate and aspartate residues from these structures and their coordinating metal ions (when present) superimpose very well with those of Mm_dCTPase (Figure 6E, F and G), highlighting the importance of these highly conserved amino acids.
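A conserved motif of this kind can be located in a sequence with a simple pattern search. The sketch below is illustrative only: the toy sequence is hypothetical and is not the real Mm_dCTPase sequence, and the 20 to 40 residue spacer bound is an assumption chosen so that the toy spacing mirrors the Glu63/Glu66 and Glu95/Asp98 offsets described above.

```python
import re

# Two acidic pairs, E-x-x-E and E-x-x-D, separated by a spacer.
# The 20-40 residue spacer bound is an assumption chosen for illustration.
MOTIF = re.compile(r"E..E.{20,40}?E..D")


def find_exxe_exxd(sequence):
    """Return 1-based positions of the four acidic residues, or None."""
    match = MOTIF.search(sequence)
    if match is None:
        return None
    first_e = match.start() + 1  # convert 0-based index to 1-based position
    last_d = match.end()         # end index equals 1-based position of the D
    return (first_e, first_e + 3, last_d - 3, last_d)


# Hypothetical toy sequence, NOT the real Mm_dCTPase sequence.
toy_sequence = "MKT" + "EAAE" + "G" * 28 + "ELLD" + "KR"
print(find_exxe_exxd(toy_sequence))
```

A real analysis would more likely read the motif positions off a structure-based or multiple sequence alignment, since a fixed spacer range may not hold across distant homologues.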

Dimeric dUTPases differ from Mm_dCTPase in that their quaternary structure is dimeric rather than tetrameric (Supplementary Figure 1A and B). These enzymes likely evolved from a MazG-like ancestor where the basic subunit almost doubled in size through a gene duplication and fusion event, following which one of the two domains lost the active site residues required for catalysis [8]. The dUTPase monomer therefore only contains one active site, as opposed to the two active sites present in the Mm_dCTPase tight dimer (Supplementary Figure 1A and B). The tight dimer of Mm_dCTPase superimposes quite well with the dUTPase monomer; however, the Mm_dCTPase tetramer superimposes poorly with the dUTPase dimer (Supplementary Figure 1C).

When compared to the S. solfataricus MazG tetramer, both the tight dimer and tetramer of this structure superimpose well with those of Mm_dCTPase (Supplementary Figure 2), suggesting that structurally the enzyme is more closely related to S. solfataricus MazG than it is to dimeric dUTPases. We also compared Mm_dCTPase to E. coli MazG, which displays a very different tertiary structure to any of the aforementioned enzymes. E. coli MazG exists as two monomers with distinct N- and C-terminal domains (Supplementary Figure 3). Two dimers are formed by interactions between the N-terminal domains and the C-terminal domains only, generating a dimer of N-terminal domains and a second dimer of C-terminal domains (Supplementary Figure 3). Interestingly, only the C-terminal domain dimer contains a functional active site for canonical nucleotides, which is proposed to result from a tightly formed additional region present in the C-terminal domain that is absent in the N-terminal domain [45]. The N-terminal domain has, however, been proposed to bind non-canonical nucleotides [8, 48]. The tight dimer of Mm_dCTPase superimposes well with both dimers, but superimposes better overall with the E. coli MazG N-terminal domain dimer (Supplementary Figure 3C and D).

Comparison of substrate recognition in Mm_dCTPase and all-α NTP pyrophosphatases

The cytosine base of dCMPNPP bound in the Mm_dCTPase structure is recognized by two histidines (His38 and His51) and is further supported by hydrophobic interactions with two tryptophans (Trp47 and Trp73) and Tyr102 (Supplementary Figure 4A). Analysis of uridine recognition in the structures of L. major, T. cruzi and C. jejuni dUTPases indicates a very different hydrogen bond network pattern from that of Mm_dCTPase (Supplementary Figure 4B). In the dUTPase structures, a consistent feature in uridine recognition is the presence of two polar residues: a glutamine which interacts with the O2 atom of the base and an asparagine which hydrogen bonds to the N3 atom (Supplementary Figure 4B). In addition, there is a tryptophan in L. major and T. cruzi dUTPase which bonds to the O4 atom of uridine. In the C. jejuni dUTPase structure this residue is a histidine, and there is an additional hydrogen bond between an asparagine and the O4 atom which is not observed in the L. major or T. cruzi proteins, which have an aspartate at the equivalent position. In all of the dUTPases the uridine base is also supported by hydrophobic interactions with two tryptophans and a phenylalanine (Supplementary Figure 4B). Superposition of Mm_dCTPase with C. jejuni dUTPase highlights the very different base recognition patterns between these enzymes, clearly indicating why the mouse enzyme prefers dCTP as a substrate over dUTP: dCTP can be better positioned within the active site based on the surrounding residues. Interestingly, E. coli MazG does not display the substrate selectivity observed for mammalian dCTPase or all-alpha dUTPases and can hydrolyze all canonical nucleotides [45]. Superposition of Mm_dCTPase with the C-terminal domain dimer of E. coli MazG bound to ATP shows that the nucleotide is supported exclusively by hydrophobic interactions involving two phenylalanines and a tryptophan, and that no hydrogen bond interactions are present. This explains why E. coli MazG displays such promiscuous enzyme activity compared to mammalian dCTPase and dimeric dUTPases, which utilize distinct hydrogen bond interactions for nucleotide recognition.

Sequence comparisons of mammalian dCTPases

The sequence of mouse dCTPase (Mus musculus: NP_075692.1) was compared to the dCTPase sequences from a wide variety of mammalian species including Homo sapiens (human dCTPase: NP_077001.1), Rattus norvegicus (rat dCTPase: NP_620247.1), Pan troglodytes (chimpanzee dCTPase: PNI12219.1), Ovis aries (sheep dCTPase: XP_004020953.1), Bos taurus (cow dCTPase: NP_001033291.1), Eptesicus fuscus (bat dCTPase: XP_008151046.1), Dasypus novemcinctus (armadillo dCTPase: XP_004479423.1), Equus caballus (horse dCTPase: XP_001501449.3), Tursiops truncatus (dolphin dCTPase: XP_019777314.1), Physeter catodon (sperm whale dCTPase: XP_007107449.1), Felis catus (cat dCTPase: XP_003998704.1) and Odobenus rosmarus divergens (walrus dCTPase: XP_004397278.1). Mm_dCTPase was shown to have a high level of sequence identity with each of the dCTPase orthologues, ranging from 76 to 91%. Of these, rat dCTPase was the most similar to Mm_dCTPase and human dCTPase was the least similar. A multiple sequence alignment was performed using Clustal Omega [49], which revealed a very high level of conservation (Figure 7). Specifically, the residues of the dCMPNPP complex structure (Figure 2) involved in coordinating magnesium ions (Glu63, Glu66, Glu95 and Asp98), hydrogen bonding (His38, His51, Asn125, Arg128 and Tyr129), hydrophobic interactions (Trp47, Trp73 and Ile101) or C−H⋅⋅⋅π interactions (Tyr102) are all entirely conserved (Figure 7). Residues from the N-terminal end of the protein (aa 1-21), which are absent in our truncated construct (aa 22-134), were generally less well conserved (Figure 7). Overall, the absolute conservation of active site residues between these very different species suggests that the dCTPase enzyme performs a highly important function in mammalian organisms.
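Pairwise identity values such as those quoted above are typically read from an alignment. The following is a minimal sketch of one common definition, in which identical positions are counted over alignment columns where neither sequence has a gap. The text does not state which definition was used, so the denominator choice here is an assumption, and the two aligned strings are hypothetical toy sequences rather than real dCTPase sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment, NOT real dCTPase sequences.
toy_a = "MKT-EAAEL"
toy_b = "MRTGEAADL"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which can shift the reported percentage by a few points.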

DISCUSSION

The dCTPase DCTPP1 has recently been shown to play a critical role in the maintenance of pyrimidine nucleotide homeostasis in cells, with downregulation of the enzyme resulting in an activated DNA damage response that adversely affects both chromosomal and mitochondrial DNA [7]. In addition, DCTPP1 also displays high activity towards modified cytosine nucleotides such as 5-Me-dCTP [20], which if incorporated into DNA could induce base mismatches or random changes in DNA methylation pattern. We have determined the first structures of an active mammalian dCTPase in complex with both a substrate analogue and reaction products. The structure of Mm_dCTPase solved in complex with the non-hydrolysable substrate analogue dCMPNPP and magnesium ions provides clear insights into the catalytic mechanism of this important enzyme. The two magnesium ions (M1 and M2) in the structure are coordinated by several residues which allow for the correct positioning of water molecules in the active site, one of which is needed for nucleophilic attack on the β-phosphate of the substrate. M2 also coordinates oxygen atoms from both the α- and β-phosphates of the substrate, activating the latter for nucleophilic attack by the incoming water. There is an ordered water molecule (w13) located 3.2 Å from the β-phosphate of dCMPNPP, which is an excellent candidate for the attacking nucleophile. The angle between w13, the β-phosphate, and the substrate leaving group is 154°, placing this water in a favourable position for an in-line attack. w13 also lies 2.1 Å from M2, 2.9 Å from Glu66 and 2.5 Å from Glu95; either glutamate could act as a base to activate w13 for nucleophilic attack. The interactions between Arg128 and Tyr129 and the α-phosphate may facilitate the departure of the nucleotide monophosphate. Arg128 plays an important role in both the initial positioning of the triphosphate group for nucleophilic attack and the stabilization of the monophosphate product, as evidenced by its significant movement when comparing the substrate- and product-bound complexes. Mammalian dCTPase greatly favours the catalysis of dCTP over other pyrimidine nucleotides [20, 21].
In our structures the cytosine base is positioned by a pi-stacking interaction with Trp47 and by two hydrogen bonds, one between His38 and an oxygen atom of the base and one between His51 and a nitrogen atom (Figure 2, Figure 4). His38 would make the equivalent hydrogen bond interaction with uracil and thymine bases, as there is an oxygen atom located in the same position as is present in cytosine. However, the nitrogen in cytosine that interacts with His51 is a hydrogen bond acceptor, whereas in both thymine and uracil the equivalent nitrogen is a hydrogen bond donor. This important difference is likely the main determinant of the preference towards cytosine dNTPs. In addition, the pi-stacking interaction between Trp47 and the base may be stronger for cytosine compared to uracil or thymine. This notion is supported by comparisons with structures of dimeric dUTPases that preferentially hydrolyse dUTP and which exhibit a very different nucleobase hydrogen bond network compared to Mm_dCTPase. The catalytic mechanism of Mm_dCTPase is more similar to that of dimeric all-alpha dUTPases than to that of all-beta dUTPases. Trimeric dUTPases are structurally distinct from all-alpha NTP pyrophosphohydrolases. These enzymes are composed predominantly of beta structure and exist in numerous species including humans, bacteria, yeast and bacteriophages [50-54]. The human all-beta dUTPase trimer has three active sites, each of which is formed by contributions from all monomers [53, 55]. Interestingly, while dimeric dUTPases can hydrolyse both dUDP and dUTP, trimeric dUTPases are inhibited by dUDP [56-58]. In the proposed mechanism for dimeric dUTPases two metal ions are required; the first metal activates the water for an in-line nucleophilic attack on the β-phosphate, whereas the second stabilizes the transition state and aids in the departure of dUMP [41]. However, in trimeric dUTPases a single metal ion aids in substrate binding and stabilizes the transition state. Furthermore, it is a conserved aspartate that activates a water molecule for nucleophilic attack, which occurs on the α-phosphate from the opposite direction, and not via an in-line attack as is the case with dimeric dUTPases and Mm_dCTPase [51, 53, 59].
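
The in-line geometry described above for w13 (a distance of 3.2 Å to the β-phosphate and an attack angle of 154°) can be measured directly from atomic coordinates. The following is a minimal sketch of such a measurement, assuming Python 3.8 or later. The coordinates are invented placeholders, not values taken from the deposited structures, and the function names are hypothetical. A perfectly in-line attack corresponds to an angle of 180°.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points, in the units of the input."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    ba = [ai - bi for ai, bi in zip(a, b)]
    bc = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    cos_theta = dot / (math.hypot(*ba) * math.hypot(*bc))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Hypothetical coordinates in angstroms, invented for illustration only.
water = (3.2, 0.0, 0.0)          # candidate nucleophilic water
phosphorus = (0.0, 0.0, 0.0)     # phosphorus atom under attack
leaving_atom = (-1.6, 0.0, 0.0)  # bridging atom of the leaving group

attack_distance = distance(water, phosphorus)
attack_angle = angle_deg(water, phosphorus, leaving_atom)
deviation = 180.0 - attack_angle  # 0 would be a perfectly in-line attack
print(f"distance = {attack_distance:.2f} A, angle = {attack_angle:.1f} deg, "
      f"deviation from in-line = {deviation:.1f} deg")
```

Applied to a real structure, the three points would be the water oxygen, the phosphorus atom under attack and the bridging atom of the leaving group, read from the coordinate file.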

The lack of activity towards dUDP by all-beta dUTPases is due to the flexible C-terminal tail of the enzyme, which in the presence of dUTP makes several interactions with the γ-phosphate of the nucleotide. This is important for correctly positioning the substrate for nucleophilic attack on the α-phosphate [51, 60-64]; since dUDP lacks a γ-phosphate, the C-terminal tail does not fold over and the α-phosphate is poorly positioned as a result. The inactivity of mammalian dCTPase towards dCDP [20, 21] may have a similar cause: the extensive interactions between the γ-phosphate of dCTP and Mm_dCTPase are not present with dCDP, and therefore the β-phosphate of the diphosphate substrate cannot be correctly positioned for efficient nucleophilic attack. In contrast, in the structures of dimeric dUTPases the γ-phosphate has very few direct interactions with the enzyme and the correct geometry can be achieved by interactions between the protein and the α- and β-phosphates, meaning that dUDP can be hydrolysed just as efficiently as dUTP [41].

Previous structures of mammalian dCTPase were solved in the absence of metal ions using an inactive protein [24]. Analysis of our Mm_dCTPase complexes clarifies why the protein used for these structures was catalytically inactive. The structure of inactive mouse dCTPase (RSCUT) was solved using a truncated form (aa 21-126) of the enzyme [24]. It was reported that the full length protein was unstable in solution, prompting the truncation of the unstable portions of the N- and C-termini [24]. The structures of active M. musculus dCTPase reported here were also obtained using a truncated version of the enzyme (aa 22-134). This construct is slightly less truncated in the C-terminal region, which, as shown in the Mm_dCTPase-dCMPNPP complex, has important implications. Specifically, our substrate and product bound structures show that the residues Arg128 and Tyr129, which are absent in the RSCUT structures, play critical roles in the positioning of the triphosphate group and the stabilization of the monophosphate leaving group. The RSCUT protein still retains the residues required for metal ion coordination; however, the structure of RSCUT in complex with 5Me-dCTP was obtained by soaking apo RSCUT crystals without metal ions. Large differences in ligand binding are observed when comparing active Mm_dCTPase-dCMPNPP with the inactive RSCUT-5Me-dCTP complex, likely a consequence of the absence of the magnesium ions and the residues Arg128 and Tyr129, which together are required for coordination of the triphosphate moiety. However, it should be noted that activity assays using RSCUT showed no activity even in the presence of magnesium ions, suggesting that metal binding alone is not sufficient for catalysis [24]. In addition to determining the kinetic parameters for Mm_dCTPase (22-134), we also determined these values for the full length enzyme. Enzyme activity and substrate binding in the truncated protein were negatively affected compared to the full length enzyme, indicating that the removed regions are important for function.

A multiple sequence comparison of mouse dCTPase with a diverse range of mammalian dCTPase orthologues, including human, showed strict conservation of the residues contributing to substrate binding (Figure 7). The complete sequence conservation observed between these very different species suggests a crucial function of this enzyme throughout the animal kingdom. The dCTPase protein is known to have a highly important function in pyrimidine homeostasis. In addition, dCTPase protects genome integrity by clearing the nucleotide pool of dCTP derivatives with adducts at the 5-position, importantly 5-methyl-dCTP, thereby preventing their incorporation into DNA. Keeping the level of 5-methylcytosine in DNA low is highly important, since spontaneous deamination of 5-methylcytosine forms thymine and consequently results in a mutation. Moreover, since 5-methylation of cytosine is the major epigenetic marker in eukaryotic cells, incorporation of 5-methyl-dCTP into DNA must be prevented, as it would produce unwanted epigenetic marking and potentially result in altered gene expression.

In highly proliferative tissues with an increased demand for DNA synthesis and an expanded dNTP pool, as is observed in numerous cancer types, human DCTPP1 is highly upregulated [25, 36, 37]. In recent years benzimidazole [65], piperazin-1-ylpyridazine [66] and triazolothiadiazole [67] derivatives have been developed and reported as novel human dCTPase inhibitors. DCTPP1 has recently been shown to degrade the active form of the drug decitabine (5-aza-2’-deoxycytidine) [65], which is used clinically for treating acute myeloid leukaemia [68, 69], and inhibitors of the enzyme have been shown to enhance the effects of the drug in HL60 leukaemia cells [65-67]. These results have been further expanded by a study from Requena et al., who showed that when HeLa cells are exposed to decitabine, several other pyrimidine metabolic enzymes in addition to DCTPP1, such as dUTPase, thymidylate synthase (TS) and dCMP deaminase (CDA), are also upregulated [70]. The downregulation of DCTPP1 or dUTPase was shown to enhance the cytotoxic effect of decitabine, resulting in uracil accumulation and misincorporation leading to DNA damage. Furthermore, the authors showed that decitabine and aza-dCMP can be deaminated to aza-deoxyuridine (aza-dUrd) and aza-dUMP, the latter of which is a potential TS inhibitor [70]. The structures presented in this work, in addition to their important mechanistic insights into mammalian dCTPase, will also provide an excellent basis for improving human DCTPP1 inhibitors, due to the complete conservation of the substrate binding pocket between mice and humans.

MATERIALS AND METHODS

Cloning, Protein Overexpression and Purification of truncated (aa 22-134) and full length Mm_dCTPase

cDNA encoding full-length Mus musculus dCTPase (Mm_dCTPase_FL), optimized for E. coli expression, was obtained from GeneArt (Life Technologies) and subcloned into the pET28a(+) vector (Novagen) using the NdeI and NotI restriction sites. The truncated Mm_dCTPase construct, encoding residues 22-134 and including a hexa-histidine affinity tag, was PCR-amplified from the aforementioned Mm_dCTPase_FL construct and cloned into the pET28a(+) vector between the NdeI and NotI restriction sites. Mm_dCTPase_FL and Mm_dCTPase (aa 22-134) were expressed in E. coli BL21 (DE3) R3 pRARE2 at 18 °C overnight, following induction by addition of IPTG to a final concentration of 0.5 mM. Cells were harvested and freeze-thawed in lysis buffer (0.1 M HEPES, 0.5 M NaCl, 10 mM imidazole, 10% (v/v) glycerol, 0.5 mM TCEP, pH 8.0) supplemented with one tablet of Complete EDTA-free (Roche) and 250 U benzonase (Sigma), and subsequently sonicated. His-tagged Mm_dCTPase (aa 22-134) was purified via immobilized metal affinity chromatography (IMAC) using a HisTrap HP column (GE Healthcare) and eluted in buffer E (20 mM HEPES, 500 mM NaCl, 500 mM imidazole, 10% (v/v) glycerol, 0.5 mM TCEP, pH 7.5), followed by a size-exclusion chromatography (SEC) step using a HiLoad 16/60 Superdex 75 column (GE Healthcare) in SEC buffer (20 mM HEPES, 0.3 M NaCl, 10% (v/v) glycerol, 0.5 mM TCEP, pH 7.5). Following protein concentration to 20 mg/mL, the Mm_dCTPase sample was aliquoted, flash frozen in liquid nitrogen and stored at -80 °C. The cell cultivation, protein production and purification steps were performed by the Protein Science Facility at the Karolinska Institutet. The hexa-histidine affinity tag was removed by thrombin (GE Healthcare) digestion in SEC buffer (20 mM HEPES pH 7.5, 300 mM NaCl, 10% (v/v) glycerol), and the sample was passed over an IMAC column (HisTrap HP, GE Healthcare) in SEC buffer. The flow-through fraction containing the free protein was further purified by SEC using a HiLoad 16/60 Superdex 200 column (GE Lifesciences) in SEC buffer.
The protein was subsequently concentrated, flash frozen in liquid nitrogen and stored at -80 °C. The protein purity was assessed using SDS-PAGE.

Crystallization and Data Collection

Aliquots of purified Mm_dCTPase (aa 22-134) (21-22 mg/mL) were preincubated with 23.4 mM MgCl2 and either 23.4 mM 2’-deoxycytidine-5’-monophosphate (dCMP), 15 mM 5-methyl-2’-deoxycytidine-5’-monophosphate (5Me-dCMP) or 23.4 mM 2’-deoxycytidine-5’-[(α,β)-imido]triphosphate (dCMPNPP) for 1.5 hours at 20 °C. All complexes were crystallized via sitting drop vapor diffusion at 20 °C in 0.1 M Bis-Tris propane pH 7.5, 63% tacsimate (Mm_dCTPase-dCMP); 0.1 M Tris pH 8.75, 2.4 M ammonium sulfate (Mm_dCTPase-5Me-dCMP); or 0.1 M Tris pH 8.0, 2.2 M ammonium sulfate (Mm_dCTPase-dCMPNPP). Protein crystals were cryo-protected using dry mineral oil and flash frozen in liquid nitrogen. X-ray diffraction data were collected at the European Synchrotron Radiation Facility (Grenoble, France) and BESSY (Helmholtz-Zentrum Berlin, Germany). Datasets were collected at 100 K on single crystals at a wavelength of 0.8726 Å (Mm_dCTPase-dCMP), 1.072 Å (Mm_dCTPase-5Me-dCMP) or 0.8726 Å (Mm_dCTPase-dCMPNPP).

Structure Determination and Refinement

All datasets were indexed and integrated using XDS [71], followed by scaling using Aimless [72] within the CCP4 suite [73]. Structures were solved via molecular replacement with Phaser [74] using the monomer of M. musculus dCTPase (PDB ID: 2oie) as the search model. This was performed assuming two monomers per asymmetric unit, in accordance with the calculated Matthews coefficient [75]. Several rounds of manual model building and refinement were performed using Coot [76] and Refmac5 [77], during which waters and ligands were incorporated into the structure. Data processing and refinement statistics are presented in Table 1. The coordinates and structure factors for Mm_dCTPase-dCMP, Mm_dCTPase-5Me-dCMP and Mm_dCTPase-dCMPNPP were deposited in the PDB under codes 6sqy, 6sqw and 6sqz, respectively.
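
The Matthews coefficient used above to choose the number of monomers per asymmetric unit is the unit cell volume divided by the total protein mass in the cell, V_M = V / (Z × n × M), where Z is the number of symmetry operators, n the number of molecules per asymmetric unit and M the molecular mass in daltons. The solvent fraction is commonly approximated as 1 - 1.23 / V_M. The sketch below illustrates the calculation only. The cell dimensions, space group and molecular mass are hypothetical placeholders rather than the values in Table 1, and the function names are our own.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume (cubic angstroms) from cell lengths and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(cell_volume, n_symops, n_molecules, mw_da):
    """V_M in cubic angstroms per dalton of protein in the unit cell."""
    return cell_volume / (n_symops * n_molecules * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, taking 1.23 cubic angstroms per dalton of protein."""
    return 1.0 - 1.23 / vm


# Hypothetical inputs for illustration; not the values reported in Table 1.
volume = unit_cell_volume(50.0, 60.0, 80.0, 90.0, 90.0, 90.0)
n_symops = 4     # e.g. an orthorhombic space group with four operators
mw_da = 12500.0  # rough placeholder mass for a construct of about 113 residues

for n in (1, 2, 3):
    vm = matthews_coefficient(volume, n_symops, n, mw_da)
    print(f"{n} per ASU: V_M = {vm:.2f} A^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

The copy number whose V_M and solvent content fall in the range typical of protein crystals is usually taken as the starting assumption for molecular replacement.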

Kinetic analysis of truncated (aa 22-134) and full length M. musculus dCTPase

Initial reaction rates were determined by incubating truncated Mm_dCTPase (aa 22-134) (300 nM) or full length Mm_dCTPase (30 nM) with dCTP (0-400 µM) or 5-methyl-dCTP (0-150 µM) at 22 °C for 0, 10, 20 and 30 minutes in dCTPase assay buffer (100 mM Tris-acetate pH 8.0, 40 mM NaCl, 10 mM MgAc, 1 mM DTT, 0.005% Tween 20) with shaking. Sampling for each time point and substrate concentration was performed in duplicate. The reaction product PPi was monitored using the PPiLight Inorganic Pyrophosphate Assay (Lonza) according to the manufacturer’s recommendations. Initial rates were calculated using linear regression, and the Michaelis-Menten equation was fitted to the data points using GraphPad Prism 8.0.
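
The two-step analysis described above (a linear fit of product against time to obtain each initial rate, followed by a fit of v = Vmax × [S] / (Km + [S]) to the rates) can be sketched without any external software. This is an illustration only and not the GraphPad Prism procedure used in this work: the data are synthetic and noise-free, the function names and the Km search bounds are hypothetical, and the fit exploits the fact that Vmax has a closed-form solution for any fixed Km, so only Km needs to be searched.

```python
import math


def initial_rate(times_min, product_um):
    """Least-squares slope of product (uM) against time (min)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_p = sum(product_um) / n
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times_min, product_um))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return sxy / sxx


def michaelis_menten(s, vmax, km):
    """Rate predicted by v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def _profile(km, substrate, rates):
    """For a fixed Km, return (sum of squared errors, best-fit Vmax)."""
    x = [s / (km + s) for s in substrate]
    vmax = sum(v * xi for v, xi in zip(rates, x)) / sum(xi * xi for xi in x)
    sse = sum((v - vmax * xi) ** 2 for v, xi in zip(rates, x))
    return sse, vmax


def fit_michaelis_menten(substrate, rates, km_min=0.01, km_max=1e4, n_grid=200):
    """Nonlinear least-squares fit; returns (Vmax, Km).

    Km is searched on a logarithmic grid (bounds are arbitrary defaults),
    then refined by golden-section search around the best grid point.
    """
    lo, hi = math.log10(km_min), math.log10(km_max)
    grid = [lo + (hi - lo) * i / n_grid for i in range(n_grid + 1)]
    best = min(range(n_grid + 1),
               key=lambda i: _profile(10 ** grid[i], substrate, rates)[0])
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n_grid)]
    ratio = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if _profile(10 ** c, substrate, rates)[0] < _profile(10 ** d, substrate, rates)[0]:
            b = d
        else:
            a = c
    km = 10 ** ((a + b) / 2)
    return _profile(km, substrate, rates)[1], km


# Synthetic, noise-free data for illustration only (not measured values).
rate_example = initial_rate([0, 10, 20, 30], [0.0, 5.0, 10.0, 15.0])
substrate_um = [12.5, 25.0, 50.0, 100.0, 200.0, 400.0]
rates = [michaelis_menten(s, vmax=2.0, km=50.0) for s in substrate_um]
vmax_fit, km_fit = fit_michaelis_menten(substrate_um, rates)
print(f"initial rate = {rate_example:.2f} uM/min, "
      f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.2f} uM")
```

With real measurements, each initial rate would additionally be divided by the enzyme concentration before fitting, matching the per-enzyme rates plotted in Figure 3.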

ACCESSION NUMBERS

The atomic coordinates and structure factors for the structures of M. musculus dCTPase in complex with dCMPNPP, dCMP and 5Me-dCMP have been deposited in the Protein Data Bank under the accession codes 6sqz, 6sqy and 6sqw, respectively.

ACKNOWLEDGEMENTS

This work was supported by the Swedish Research Council (T. Helleday, P. Stenmark), the Knut and Alice Wallenberg Foundation (T. Helleday, P. Stenmark), the Göran Gustafsson Foundation, the Swedish Children’s Cancer Foundation, the Swedish Pain Relief Foundation and the Torsten and Ragnar Söderberg Foundation (T. Helleday), the Swedish Cancer Society (T. Helleday, P. Stenmark) and the Crafoord Foundation (P. Stenmark). We thank the beamline scientists at the European Synchrotron Radiation Facility (ESRF) and BESSY for their support in structural biology data collection. We also thank the Protein Science Facility (PSF) for protein production.

AUTHOR CONTRIBUTIONS

P.S. and T.H. conceived the project and supervised the study. M.C. conducted crystallization experiments and collected data. A.-S.J. conducted biochemical experiments. E.S. refined structures, analysed the results and wrote the manuscript. All authors reviewed the results, contributed to the writing, and approved the final version of the manuscript.

COMPETING FINANCIAL INTERESTS STATEMENT

The authors declare no competing financial interests.

REFERENCES

[1] Meuth M. The molecular basis of mutations induced by deoxyribonucleoside triphosphate pool imbalances in mammalian cells. Exp Cell Res. 1989;181:305-16.
[2] Mathews CK. DNA precursor metabolism and genomic stability. FASEB J. 2006;20:1300-14.
[3] Rampazzo C, Miazzi C, Franzolin E, Pontarin G, Ferraro P, Frangini M, et al. Regulation by degradation, a cellular defense against deoxyribonucleotide pool imbalances. Mutat Res-Gen Tox En. 2010;703:2-10.
[4] Kumar D, Abdulovic AL, Viberg J, Nilsson AK, Kunkel TA, Chabes A. Mechanisms of mutagenesis in vivo due to imbalanced dNTP pools. Nucleic Acids Res. 2011;39:1360-71.
[5] Franzolin E, Pontarin G, Rampazzo C, Miazzi C, Ferraro P, Palumbo E, et al. The deoxynucleotide triphosphohydrolase SAMHD1 is a major regulator of DNA precursor pools in mammalian cells. Proc Natl Acad Sci U S A. 2013;110:14272-7.
[6] Nordlund P, Reichard P. Ribonucleotide reductases. Annu Rev Biochem. 2006;75:681-706.
[7] Martinez-Arribas B, Requena CE, Perez-Moreno G, Ruiz-Perez LM, Vidal AE, Gonzalez-Pacanowska D. DCTPP1 prevents a mutator phenotype through the modulation of dCTP, dTTP and dUTP pools. Cell Mol Life Sci. 2019.
[8] Moroz OV, Murzin AG, Makarova KS, Koonin EV, Wilson KS, Galperin MY. Dimeric dUTPases, HisE, and MazG belong to a new superfamily of all-alpha NTP pyrophosphohydrolases with potential "house-cleaning" functions. J Mol Biol. 2005;347:243-55.
[9] Hidalgo-Zarco F, Camacho AG, Bernier-Villamor V, Nord J, Ruiz-Perez LM, Gonzalez-Pacanowska D. Kinetic properties and inhibition of the dimeric dUTPase-dUDPase from Leishmania major. Protein Sci. 2001;10:1426-33.
[10] Camacho A, Hidalgo-Zarco F, Bernier-Villamor V, Ruiz-Perez LM, Gonzalez-Pacanowska D. Properties of Leishmania major dUTP nucleotidohydrolase, a distinct nucleotide-hydrolysing enzyme in kinetoplastids. Biochem J. 2000;346 Pt 1:163-8.
[11] Bernier-Villamor V, Camacho A, Hidalgo-Zarco F, Perez J, Ruiz-Perez LM, Gonzalez-Pacanowska D. Characterization of deoxyuridine 5'-triphosphate nucleotidohydrolase from Trypanosoma cruzi. FEBS Lett. 2002;526:147-50.
[12] Harkiolaki M, Dodson EJ, Bernier-Villamor V, Turkenburg JP, Gonzalez-Pacanowska D, Wilson KS. The crystal structure of Trypanosoma cruzi dUTPase reveals a novel dUTP/dUDP binding fold. Structure. 2004;12:41-53.
[13] Zhang J, Inouye M. MazG, a nucleoside triphosphate pyrophosphohydrolase, interacts with Era, an essential GTPase in Escherichia coli. J Bacteriol. 2002;184:5323-9.
[14] Zhang J, Zhang Y, Inouye M. Thermotoga maritima MazG protein has both nucleoside triphosphate pyrophosphohydrolase and pyrophosphatase activities. J Biol Chem. 2003;278:21408-14.
[15] Zimmerman SB, Kornberg A. Deoxycytidine di- and triphosphate cleavage by an enzyme formed in bacteriophage-infected Escherichia coli. J Biol Chem. 1961;236:1480-6.
[16] Greenberg GR. New dUTPase and dUDPase activities after infection of Escherichia coli by T2 bacteriophage. Proc Natl Acad Sci U S A. 1966;56:1226-32.
[17] Allen JR, Lasser GW, Goldman DA, Booth JW, Mathews CK. T4 phage deoxyribonucleotide-synthesizing enzyme complex. Further studies on enzyme composition and regulation. J Biol Chem. 1983;258:5746-53.
[18] Smith DW, Ames BN. Phosphoribosyladenosine Monophosphate, an Intermediate in Histidine Biosynthesis. J Biol Chem. 1965;240:3056-63.
[19] Keesey JK, Jr., Bigelis R, Fink GR. The product of the his4 gene cluster in Saccharomyces cerevisiae. A trifunctional polypeptide. J Biol Chem. 1979;254:7427-33.
[20] Requena CE, Perez-Moreno G, Ruiz-Perez LM, Vidal AE, Gonzalez-Pacanowska D. The NTP pyrophosphatase DCTPP1 contributes to the homoeostasis and cleansing of the dNTP pool in human cells. Biochem J. 2014;459:171-80.
[21] Nonaka M, Tsuchimoto D, Sakumi K, Nakabeppu Y. Mouse RS21-C6 is a mammalian 2'-deoxycytidine 5'-triphosphate pyrophosphohydrolase that prefers 5-iodocytosine. FEBS J. 2009;276:1654-66.
[22] Evans MD, Dizdaroglu M, Cooke MS. Oxidative DNA damage and disease: induction, repair and significance. Mutat Res. 2004;567:1-61.
[23] Topal MD, Baker MS. DNA precursor pool: a significant target for N-methyl-N-nitrosourea in C3H/10T1/2 clone 8 cells. Proc Natl Acad Sci U S A. 1982;79:2211-5.
[24] Wu B, Liu Y, Zhao Q, Liao S, Zhang J, Bartlam M, et al. Crystal structure of RS21-C6, involved in nucleoside triphosphate pyrophosphohydrolysis. J Mol Biol. 2007;367:1405-12.
[25] Song FF, Xia LL, Ji P, Tang YB, Huang ZM, Zhu L, et al. Human dCTP pyrophosphatase 1 promotes breast cancer cell growth and stemness through the modulation on 5-methyl-dCTP metabolism and global hypomethylation. Oncogenesis. 2015;4:e159.
[26] Mathews CK. Deoxyribonucleotide metabolism, mutagenesis and cancer. Nat Rev Cancer. 2015;15:528-39.
[27] Rudd SG, Valerie NCK, Helleday T. Pathways controlling dNTP pools to maintain genome stability. DNA Repair (Amst). 2016;44:193-204.
[28] Gad H, Koolmeister T, Jemth A-S, Eshtad S, Jacques SA, Ström CE, et al. MTH1 inhibition eradicates cancer by preventing sanitation of the dNTP pool. Nature. 2014;508:215.
[29] Jemth AS, Gustafsson R, Brautigam L, Henriksson L, Vallin KSA, Sarno A, et al. MutT homologue 1 (MTH1) catalyzes the hydrolysis of mutagenic O6-methyl-dGTP. Nucleic Acids Res. 2018;46:10888-904.
[30] McIntosh EM, Ager DD, Gadsden MH, Haynes RH. Human dUTP pyrophosphatase: cDNA sequence and potential biological importance of the enzyme. Proc Natl Acad Sci U S A. 1992;89:8020-4.
[31] Abolhassani N, Iyama T, Tsuchimoto D, Sakumi K, Ohno M, Behmanesh M, et al. NUDT16 and ITPA play a dual protective role in maintaining chromosome stability and cell growth by eliminating dIDP/IDP and dITP/ITP from nucleotide pools in mammals. Nucleic Acids Res. 2010;38:2891-903.
[32] Sakumi K, Abolhassani N, Behmanesh M, Iyama T, Tsuchimoto D, Nakabeppu Y. ITPA protein, an enzyme that eliminates deaminated purine nucleoside triphosphates in cells. Mutat Res. 2010;703:43-50.
[33] Pang B, McFaline JL, Burgis NE, Dong M, Taghizadeh K, Sullivan MR, et al. Defects in purine nucleotide metabolism lead to substantial incorporation of xanthine and hypoxanthine into DNA and RNA. Proc Natl Acad Sci U S A. 2012;109:2319-24.
[34] Goldstone DC, Ennis-Adeniran V, Hedden JJ, Groom HC, Rice GI, Christodoulou E, et al. HIV-1 restriction factor SAMHD1 is a deoxynucleoside triphosphate triphosphohydrolase. Nature. 2011;480:379-82.
[35] Amie SM, Bambara RA, Kim B. GTP is the primary activator of the anti-HIV restriction factor SAMHD1. J Biol Chem. 2013;288:25001-6.
[36] Zhang Y, Ye WY, Wang JQ, Wang SJ, Ji P, Zhou GY, et al. dCTP pyrophosphohydrase exhibits nucleic accumulation in multiple carcinomas. Eur J Histochem. 2013;57:e29.
[37] Morisaki T, Yashiro M, Kakehashi A, Inagaki A, Kinoshita H, Fukuoka T, et al. Comparative proteomics analysis of gastric cancer stem cells. PLoS One. 2014;9:e110736.
[38] Friese A, Kapoor S, Schneidewind T, Vidadala SR, Sardana J, Brause A, Förster T, Bischoff M, Wagner J, Janning P, Ziegler S, Waldmann H. Chemical Genetics Reveals a Role of dCTP Pyrophosphatase 1 in Wnt Signaling. Angew Chem Int Ed. 2019;58:13009.
[39] Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J Mol Biol. 2007;372:774-97.
[40] Holm L, Rosenstrom P. Dali server: conservation mapping in 3D. Nucleic Acids Res. 2010;38:W545-W9.
[41] Hemsworth GR, Gonzalez-Pacanowska D, Wilson KS. On the catalytic mechanism of dimeric dUTPases. Biochem J. 2013;456:81-8.
[42] Hemsworth GR, Moroz OV, Fogg MJ, Scott B, Bosch-Navarrete C, Gonzalez-Pacanowska D, et al. The crystal structure of the Leishmania major deoxyuridine triphosphate nucleotidohydrolase in complex with nucleotide analogues, dUMP, and deoxyuridine. J Biol Chem. 2011;286:16470-81.
[43] Moroz OV, Harkiolaki M, Galperin MY, Vagin AA, Gonzalez-Pacanowska D, Wilson KS. The crystal structure of a complex of Campylobacter jejuni dUTPase with substrate analogue sheds light on the mechanism and suggests the "basic module" for dimeric d(C/U)TPases. J Mol Biol. 2004;342:1583-97.
[44] Shlomai J, Kornberg A. Deoxyuridine triphosphatase of Escherichia coli. Purification, properties, and use as a reagent to reduce uracil incorporation into DNA. J Biol Chem. 1978;253:3305-12.
[45] Lee S, Kim MH, Kang BS, Kim JS, Kim GH, Kim YG, et al. Crystal structure of Escherichia coli MazG, the regulator of nutritional stress response. J Biol Chem. 2008;283:15232-40.
[46] Goncalves AM, de Sanctis D, McSweeney SM. Structural and functional insights into DR2231 protein, the MazG-like nucleoside triphosphate pyrophosphohydrolase from Deinococcus radiodurans. J Biol Chem. 2011;286:30691-705.
[47] Kim MI, Hong M. Crystal structure of the Bacillus-conserved MazG protein, a nucleotide pyrophosphohydrolase. Biochem Biophys Res Commun. 2016;472:237-42.
[48] Galperin MY, Moroz OV, Wilson KS, Murzin AG. House cleaning, a part of good housekeeping. Mol Microbiol. 2006;59:5-19.
[49] Chojnacki S, Cowley A, Lee J, Foix A, Lopez R. Programmatic access to bioinformatics tools from EMBL-EBI update: 2017. Nucleic Acids Res. 2017;45:W550-W3.
[50] Maiques E, Quiles-Puchalt N, Donderis J, Ciges-Tomas JR, Alite C, Bowring JZ, et al. Another look at the mechanism involving trimeric dUTPases in Staphylococcus aureus pathogenicity island induction involves novel players in the party. Nucleic Acids Res. 2016;44:5457-69.
[51] Chan S, Segelke B, Lekin T, Krupka H, Cho US, Kim MY, et al. Crystal structure of the Mycobacterium tuberculosis dUTPase: insights into the catalytic mechanism. J Mol Biol. 2004;341:503-17.
[52] Tchigvintsev A, Singer AU, Flick R, Petit P, Brown G, Evdokimova E, et al. Structure and activity of the Saccharomyces cerevisiae dUTP pyrophosphatase DUT1, an essential housekeeping enzyme. Biochem J. 2011;437:243-53.
[53] Barabas O, Pongracz V, Kovari J, Wilmanns M, Vertessy BG. Structural insights into the catalytic mechanism of phosphate ester hydrolysis by dUTPase. J Biol Chem. 2004;279:42907-15.
[54] Varga B, Barabas O, Kovari J, Toth J, Hunyadi-Gulyas E, Klement E, et al. Active site closure facilitates juxtaposition of reactant atoms for initiation of catalysis by human dUTPase. FEBS Lett. 2007;581:4783-8.
[55] Kovari J, Imre T, Szabo P, Vertessy BG. Mechanistic studies of dUTPases. Nucleosides Nucleotides Nucleic Acids. 2004;23:1475-9.
[56] Leang RS, Wu TT, Hwang S, Liang LT, Tong L, Truong JT, et al. The anti-interferon activity of conserved viral dUTPase ORF54 is essential for an effective MHV-68 infection. PLoS Pathog. 2011;7:e1002292.
[57] Madrid AS, Ganem D. Kaposi's sarcoma-associated herpesvirus ORF54/dUTPase downregulates a ligand for the NK activating receptor NKp44. J Virol. 2012;86:8693-704.
[58] Larsson G, Nyman PO, Kvassman JO. Kinetic characterization of dUTPase from Escherichia coli. J Biol Chem. 1996;271:24010-6.
[59] Vertessy BG, Toth J. Keeping uracil out of DNA: physiological role, structure and catalytic mechanism of dUTPases. Acc Chem Res. 2009;42:97-106.
[60] Nord J, Kiefer M, Adolph HW, Zeppezauer MM, Nyman PO. Transient kinetics of ligand binding and role of the C-terminus in the dUTPase from equine infectious anemia virus. FEBS Lett. 2000;472:312-6.
[61] Shao H, Robek MD, Threadgill DS, Mankowski LS, Cameron CE, Fuller FJ, et al. Characterization and mutational studies of equine infectious anemia virus dUTPase. Biochim Biophys Acta. 1997;1339:181-91.
[62] Vertessy BG. Flexible glycine rich motif of Escherichia coli deoxyuridine triphosphate nucleotidohydrolase is important for functional but not for structural integrity of the enzyme. Proteins. 1997;28:568-79.
[63] Pecsi I, Szabo JE, Adams SD, Simon I, Sellers JR, Vertessy BG, et al. Nucleotide pyrophosphatase employs a P-loop-like motif to enhance catalytic power and NDP/NTP discrimination. Proc Natl Acad Sci U S A. 2011;108:14437-42.
[64] Mustafi D, Bekesi A, Vertessy BG, Makinen MW. Catalytic and structural role of the metal ion in dUTP pyrophosphatase. Proc Natl Acad Sci U S A. 2003;100:5670-5.
[65] Llona-Minguez S, Hoglund A, Jacques SA, Johansson L, Calderon-Montano JM, Claesson M, et al. Discovery of the First Potent and Selective Inhibitors of Human dCTP Pyrophosphatase 1. J Med Chem. 2016;59:1140-8.
[66] Llona-Minguez S, Hoglund A, Ghassemian A, Desroses M, Calderon-Montano JM, Burgos Moron E, et al. Piperazin-1-ylpyridazine Derivatives Are a Novel Class of Human dCTP Pyrophosphatase 1 Inhibitors. J Med Chem. 2017;60:4279-92.
[67] Llona-Minguez S, Hoglund A, Wiita E, Almlof I, Mateus A, Calderon-Montano JM, et al. Identification of Triazolothiadiazoles as Potent Inhibitors of the dCTP Pyrophosphatase 1. J Med Chem. 2017;60:2148-54.
[68] Santos FP, Kantarjian H, Garcia-Manero G, Issa JP, Ravandi F. Decitabine in the treatment of myelodysplastic syndromes. Expert Rev Anticancer Ther. 2010;10:9-22.
[69] Griffiths EA, Gore SD. Epigenetic therapies in MDS and AML. Adv Exp Med Biol. 2013;754:253-83.
[70] Requena CE, Perez-Moreno G, Horvath A, Vertessy BG, Ruiz-Perez LM, Gonzalez-Pacanowska D, et al. The nucleotidohydrolases DCTPP1 and dUTPase are involved in the cellular response to decitabine. Biochem J. 2016;473:2635-43.
[71] Kabsch W. XDS. Acta Crystallogr D Biol Crystallogr. 2010;66:125-32.
[72] Evans P. Scaling and assessment of data quality. Acta Crystallogr D Biol Crystallogr. 2006;62:72-82.
[73] Bailey S. The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr. 1994;50:760-3.
[74] McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. Phaser crystallographic software. J Appl Crystallogr. 2007;40:658-74.
[75] Matthews BW. Solvent content of protein crystals. J Mol Biol. 1968;33:491-7.
[76] Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126-32.
[77] Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr. 1997;53:240-55.

FIGURE LEGENDS

Figure 1. Overall structure of Mm_dCTPase in complex with dCMPNPP. (A) Structure of the Mm_dCTPase monomer, depicted as a ribbon representation. The secondary structure elements consisting of α-helices 1-4 are labelled. The non-hydrolysable substrate analogue dCMPNPP is shown as a stick representation with its atoms colored as follows: carbons yellow, oxygens red, nitrogens blue and phosphorus atoms orange. Magnesium ions required for activity and ligand coordination are displayed as magenta spheres. The observed N- and C-termini are labelled. (B) Crystal structure of dimeric Mm_dCTPase shown as cartoon, with the two monomers A and B colored green and blue, respectively, and the dCMPNPP and magnesium ions shown as in A. (C) Structure of the Mm_dCTPase tetramer. Monomers A and C are colored green and monomers B and D are colored blue.

Figure 2. Substrate recognition of dCMPNPP by Mm_dCTPase. (A) The active site hydrogen bond network of Mm_dCTPase bound to dCMPNPP. Monomers A, C and D are shown as ribbon representations colored green, green and blue, respectively. Residues contributing to ligand binding are labelled by residue number and respective chain, and colored as follows: carbons according to monomer color, nitrogens blue, oxygens red and phosphorus atoms orange. The non-hydrolysable substrate analogue dCMPNPP is shown as sticks, colored as in Figure 1. The magnesium ions involved in substrate coordination are shown as pink spheres, and ligand coordinating water molecules as red spheres. Interactions less than 3.5 Å are shown as dashed lines. The calculated 2Fo-Fc electron density map around dCMPNPP and the magnesium ions is shown in blue, contoured at 1.5 σ, and the Fo-Fc electron density map is shown in red and green, contoured at -3.5 σ and +3.5 σ, respectively. (B) Active site hydrogen bonding network of Mm_dCTPase, with bonding atoms shown as in A, highlighting the complete magnesium and phosphate coordination. The putative nucleophilic water (w13), appropriately positioned to attack the β-phosphate of the substrate, is highlighted. The α-, β- and γ-phosphates of dCMPNPP are indicated.

Figure 3. Enzyme kinetics of Mm_dCTPase (aa 22-134) and full length Mm_dCTPase with dCTP and 5-methyl-dCTP. (A) Saturation curves of Mm_dCTPase (aa 22-134) and Mm_dCTPase FL hydrolysis of dCTP (left) and 5-Me-dCTP (right). Substrate hydrolysis was determined by measuring PPi formation, which was detected using the PPiLight Inorganic Pyrophosphate Assay (Lonza). Data are presented as v (hydrolyzed substrate (µM) per minute per [enzyme]). Initial reaction rates were determined in duplicate. (B) Kinetic parameters of Mm_dCTPase (aa 22-134) and Mm_dCTPase FL for dCTP and 5-Me-dCTP hydrolysis. Kinetic values were determined by fitting the Michaelis-Menten equation to the initial rates using GraphPad Prism. Data are the average ± s.d. from two independent experiments.

Figure 4. Product recognition of dCMP and 5Me-dCMP by Mm_dCTPase. The active site hydrogen bond network of Mm_dCTPase in complex with (A) dCMP and (B) 5Me-dCMP. Monomers A, C and D are shown as ribbon representations colored green, green and blue, respectively. Residues contributing to ligand binding are labelled by residue number and respective chain, and colored as follows: carbons according to monomer color, nitrogens blue and oxygens red. Magnesium ions involved in substrate coordination are shown as magenta spheres. Water molecules are shown as red spheres. dCMP and 5Me-dCMP are shown as sticks with carbon atoms colored yellow. Interactions less than 3.5 Å are shown as dashed lines. The calculated 2Fo-Fc electron density maps around dCMP and 5Me-dCMP are shown in blue, contoured at 1.5 σ, and the Fo-Fc electron density maps are shown in green and red, contoured at +3.5 σ and -3.5 σ, respectively. Cα-atom superposition of Mm_dCTPase 5Me-dCMP with (C) Mm_dCTPase dCMP and (D) Mm_dCTPase dCMPNPP. The 5Me-dCMP complex structure is colored as in B, except for the coordinating magnesium ion, which is shown as a light pink sphere. The overall structures of the Mm_dCTPase complexes with dCMP and dCMPNPP are colored white, with residues from these structures shown as sticks and colored as follows: carbons white, nitrogens blue, oxygens red and phosphorus atoms orange. dCMP and dCMPNPP are depicted as gray stick models. Magnesium ions from the dCMP and dCMPNPP complexes are shown as magenta spheres.

Figure 5. Comparisons of active and inactive mouse dCTPase structures. Cα-atom superposition of active dCMPNPP-bound mouse dCTPase with (A) inactive apo mouse dCTPase (PDB ID: 2oie) and (B) inactive 5Me-dCTP-bound mouse dCTPase (PDB ID: 2oig). In the structure of active dCMPNPP-bound Mm_dCTPase, monomers A, C and D are shown as ribbon representations colored green, green, and blue, respectively. Amino acids contributing to ligand binding from monomers A, C and D are labeled. C atoms are colored according to monomer color, N atoms are colored blue, O atoms are colored red and P atoms are colored orange. Magnesium ions involved in substrate coordination are shown as magenta spheres. dCMPNPP is shown as a stick model with C atoms colored yellow. The overall structures of inactive apo and 5Me-dCTP-bound dCTPase are colored white. 5Me-dCTP is depicted as a gray stick model. Hydrogen bond interactions for inactive mouse dCTPase with 5Me-dCTP are shown as dashed lines.

Figure 6. Structural comparisons of Mm_dCTPase with related enzymes from the all-α NTP pyrophosphatase superfamily. (A) Multiple sequence alignment of the four core alpha helices (α1-α4) from the M. musculus dCTPase monomer (UniProt: Q9QY93) with the equivalent helices of T. cruzi dUTPase (UniProt: O15923), C. jejuni (UniProt: Q9PMK9), L. major (UniProt: O15826), E. coli (UniProt: P33646), S. solfataricus (UniProt: Q97U11) and D. radiodurans (UniProt: Q9RS96). Sequences were compared using Clustal Omega through the EBI webserver. The resulting alignment is presented colored according to sequence similarity using BOXSHADE. Identical residues are shaded black, while grey shading indicates amino acids with conserved physicochemical properties. The Mg2+ binding residues, which are entirely conserved between the proteins, are shaded red. (B) Structure of the Mm_dCTPase monomer (dark blue) compared to the monomers of T. cruzi (yellow, PDB ID: 1OGK), L. major (pink, PDB ID: 2YAY) and C. jejuni (cyan, PDB ID: 2CIC). (C) The Mm_dCTPase monomer compared to the monomers of S. solfataricus (green, PDB ID: 1VMG) and D. radiodurans (orange, PDB ID: 2YFC). (D) The Mm_dCTPase monomer compared to the N-terminal (grey, PDB ID: 3CRC) and C-terminal (yellow, PDB ID: 3CRC) domains of E. coli MazG. Comparison of the Mg2+ coordinating residues of Mm_dCTPase (E63, E66, E95 and D98) with (E) T. cruzi (E49, E52, E77 and D80), L. major (E48, E51, E76 and E79) and C. jejuni (E46, E49, E74 and D77). dCMPNPP from Mm_dCTPase and dUPNPP from L. major and C. jejuni are shown as sticks. Magnesium ions from Mm_dCTPase and C. jejuni and the calcium ions from L. major are depicted as spheres. (F) Comparison with S. solfataricus (E35, E38, E54 and D57) and D. radiodurans (E47, E50, E79 and D82). (G) Comparison with the E. coli MazG N-terminal (E38, E42, E58 and D61) and C-terminal (E172, E175, E193 and D196) dimers. dCMPNPP from Mm_dCTPase and ATP from the E. coli C-terminal dimer are shown as sticks and the coordinating magnesium ions are depicted as spheres.

Figure 7. Sequence alignment of mouse dCTPase and mammalian dCTPases. Sequences were compared using Clustal Omega through the EBI webserver. The resulting alignment is presented colored according to sequence similarity using BOXSHADE. Identical residues are shaded black, while grey shading indicates amino acids with conserved physicochemical properties. Residues from the active site which are involved in metal coordination and substrate binding are indicated. Black diamonds denote residues involved in metal coordination, while white diamonds correspond to residues involved in hydrophobic interactions. Residues from monomers C and D which are involved in hydrogen bond interactions are indicated by black squares and spotted diamonds, respectively. The secondary structure corresponding to the amino acid sequence of mouse dCTPase is displayed above the alignment, with alpha helices (α) and 3₁₀-helices (η) shown as boxes.

Supplementary Figure 1. Structural comparisons of Mm_dCTPase with all-α dUTPases. (A) Cartoon representation of the Mm_dCTPase tetramer. Individual monomers A and C are colored dark blue and monomers B and D are colored purple. The non-hydrolysable substrate analogue dCMPNPP is shown as a stick representation with its atoms colored as follows: carbons yellow, oxygens red, nitrogens blue and phosphorus atoms orange. (B) Cα-atom superposition of all-α dUTPases from T. cruzi (yellow, PDB ID: 1OGK), L. major (pink, PDB ID: 2YAY) and C. jejuni (cyan, PDB ID: 2CIC). The non-hydrolysable analogue dUPNPP from the C. jejuni structure is shown as a black stick model. (C) Cα-atom superposition of the Mm_dCTPase tetramer with the C. jejuni dUTPase dimer. The coloring is consistent with panels A and B, except for the second dimer of the C. jejuni structure, which is shown as a transparent cyan cartoon.

Supplementary Figure 2. Structural comparisons of Mm_dCTPase with S. solfataricus MazG. Cα-atom superposition of the Mm_dCTPase tetramer with the S. solfataricus MazG tetramer (PDB ID: 1VMG). The individual monomers of Mm_dCTPase are colored dark blue (monomers A and C) and purple (monomers B and D). The non-hydrolysable substrate analogue dCMPNPP is shown as a stick representation with its atoms colored as follows: carbons yellow, oxygens red, nitrogens blue and phosphorus atoms orange. The individual monomers of S. solfataricus MazG are colored dark green (monomers A and C) and light green (monomers B and D).

Supplementary Figure 3. Structural comparisons of Mm_dCTPase with E. coli MazG. (A) Cartoon representation of the Mm_dCTPase tetramer. Individual monomers A and C are colored dark blue and monomers B and D are colored purple. The non-hydrolysable substrate analogue dCMPNPP is shown as a stick representation with its atoms colored as follows: carbons yellow, oxygens red, nitrogens blue and phosphorus atoms orange. (B) Cartoon representation of E. coli MazG (PDB ID: 3CRC). The overall structure is formed by interactions between the N- and C-terminal domains of one monomer (shown in grey and yellow, respectively) and the N- and C-terminal domains of the second monomer (shown in light grey and light yellow, respectively). This results in one dimer composed only of N-terminal domains and another composed only of C-terminal domains. ATP present in the C-terminal domain dimer is shown as a black stick model. (C) Cα-atom superposition of the Mm_dCTPase tight dimer with the MazG N-terminal domain dimer. (D) Cα-atom superposition of the Mm_dCTPase tight dimer with the MazG C-terminal domain dimer.

Supplementary Figure 4. Structural comparisons showing nucleobase recognition in the all-α NTP pyrophosphatase superfamily. (A) Mm_dCTPase hydrogen bond network focusing on the recognition of the cytosine base of dCMPNPP. Residues contributing to ligand binding are labeled by residue number and respective chain, and colored as follows: carbons white, nitrogens blue, oxygens red. The non-hydrolysable substrate analogue dCMPNPP is shown as sticks. (B) Cα-atom superposition of all-α dUTPases from T. cruzi (yellow, PDB ID: 1OGK), L. major (pink, PDB ID: 2YAY) and C. jejuni (cyan, PDB ID: 2CIC) highlighting the recognition of uridine by the enzymes. The non-hydrolysable analogue dUPNPP from L. major and C. jejuni and dUDP from T. cruzi are shown as sticks. (C) Cα-atom superposition of Mm_dCTPase (white) and C. jejuni dUTPase (cyan). (D) Cα-atom superposition of Mm_dCTPase (white) and the C-terminal domain dimer of E. coli MazG (orange, PDB ID: 3CRC).

Table 1. Data Collection and Refinement Statistics

                            dCMPNPP_dCTPase      5-Me-dCMP_dCTPase    dCMP_dCTPase
PDB code                    6sqz                 6sqw                 6sqy

Data collection
Space group                 P4₁2₁2               P4₁2₁2               P4₁2₁2
Cell dimensions
  a, b, c (Å)               59.0, 59.0, 142.7    58.5, 58.5, 140.9    59.3, 59.3, 140.5
  α, β, γ (°)               90.0, 90.0, 90.0     90.0, 90.0, 90.0     90.0, 90.0, 90.0
Resolution (Å)              49.5–1.9             45.0–1.8             45.3–1.9
No. observations            279241 (26382)       328049 (32921)       236278 (22981)
No. unique reflections      20663 (2015)         23533 (2289)         20554 (1996)
Rmerge                      0.10 (1.14)          0.13 (1.36)          0.10 (1.21)
CC(1/2)                     0.99 (0.64)          0.99 (0.50)          0.99 (0.63)
I/σI                        18.2 (2.4)           12.2 (1.7)           16.4 (2.1)
Completeness (%)            100.0 (100.0)        100.0 (100.0)        100.0 (100.0)
Redundancy                  13.5 (13.1)          13.9 (14.4)          11.5 (11.5)

Refinement
Resolution (Å)              54.5–1.90            54.0–1.80            54.6–1.90
No. reflections             19540                22280                19467
Rwork/Rfree (%)             18.7/21.7            19.2/21.5            21.5/17.9
No. of atoms
  Protein                   1837                 1879                 1888
  Ligand/ion                60                   44                   42
  Water                     153                  178                  178
B-factors
  Protein (Å²)              38.3                 35.5                 33.8
  Ligand/ion (Å²)           41.6                 30.4                 35.1
  Water (Å²)                45.3                 42.0                 43.9
RMSDs
  Bond lengths (Å)          0.014                0.014                0.013
  Bond angles (°)           1.65                 1.60                 1.52
Ramachandran plot (%)
  Favoured                  99.1                 99.1                 99.1
  Allowed                   0.9                  0.9                  0.9
  Disallowed                0                    0                    0

Values in parentheses are for the highest-resolution shell
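
As a quick arithmetic check on Table 1, redundancy (multiplicity) can be read as the number of observations divided by the number of unique reflections. The sketch below recomputes that ratio from the overall values in the table and compares it with the reported redundancy. The dictionary layout and function name are illustrative assumptions, and the definition used is the conventional one rather than a statement of how the processing software computed it.

```python
# Overall values copied from Table 1:
# PDB code -> (no. observations, no. unique reflections, reported redundancy)
table1 = {
    "6sqz": (279241, 20663, 13.5),
    "6sqw": (328049, 23533, 13.9),
    "6sqy": (236278, 20554, 11.5),
}

def redundancy(n_observations, n_unique):
    """Average number of times each unique reflection was measured."""
    return n_observations / n_unique

for pdb_code, (n_obs, n_unique, reported) in table1.items():
    computed = redundancy(n_obs, n_unique)
    print(f"{pdb_code}: computed {computed:.2f}, reported {reported}")
```

For all three data sets the recomputed ratio agrees with the reported redundancy to within rounding at one decimal place.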