Structure, Vol. 11, 1369–1379, November, 2003, 2003 Elsevier Science Ltd. All rights reserved.
DOI 10.1016/j.str.2003.10.001
Structural Insights into Single-Stranded DNA Binding and Cleavage by F Factor TraI Saumen Datta,1 Chris Larkin,1 and Joel F. Schildbach* Department of Biology Johns Hopkins University 3400 North Charles Street Baltimore, Maryland 21218
Summary Conjugative plasmid transfer between bacteria disseminates antibiotic resistance and diversifies prokaryotic genomes. Relaxases, proteins essential for conjugation, cleave one plasmid strand sequence specifically prior to transfer. Cleavage occurs through a Mg2ⴙ-dependent transesterification involving a tyrosyl hydroxyl and a DNA phosphate. The structure of the F plasmid TraI relaxase domain, described here, is a five-strand  sheet flanked by ␣ helices. The protein resembles replication initiator protein AAV-5 Rep but is circularly permuted, yielding a different topology. The  sheet forms a binding cleft lined with neutral, nonaromatic residues, unlike most single-stranded DNA binding proteins which use aromatic and charged residues. The cleft contains depressions, suggesting base recognition occurs in a knob-into-hole fashion. Unlike most nucleases, three histidines but no acidic residues coordinate a Mg2ⴙ located near the catalytic tyrosine. The full positive charge on the Mg2ⴙ and the architecture of the active site suggest multiple roles for Mg2ⴙ in DNA cleavage. Introduction During bacterial conjugation, a copy of a conjugative plasmid is transferred between bacteria (Frost et al., 1994; Zechner et al., 2000). Because conjugation can occur between species, it contributes both to the transfer of antibiotic resistance genes between bacteria and to diversification of prokaryotic genomes. Although conjugative plasmids show considerable sequence diversity, their transfer processes have many similarities, including unidirectional transfer of single-stranded DNA (ssDNA). Relaxase proteins, also called nickases, mobilization proteins, and transesterases, are responsible for binding and cleaving a region of plasmid DNA, in singlestranded form, in a sequence-specific manner prior to transfer. Relaxases belong to a broad family of proteins that cleave and ligate DNA during plasmid transfer or rolling circle replication of plasmids, bacteriophages, or viruses (Ilyina and Koonin, 1992; Koonin and Ilyina, 1993). This classification is based in part on the presence of a conserved “HUH” motif. In this motif, H is histidine, U is a hydrophobic residue, and the amino acids are organized consecutively. HUH proteins show considerable se*Correspondence:
[email protected] 1 These authors contributed equally to the work.
quence and organizational diversity. Some, including relaxases from F (Traxler and Minkley, 1988) and R388 plasmids (Grandoso et al., 1994) and Rep proteins from parvoviruses (Im and Muzyczka, 1990), contain C-terminal helicase activities. Other HUH proteins, including TraI of IncP plasmid RP4 and gene A protein of φ174, lack helicase activity, but do contain N- and/or C-terminal regions that may be involved in protein-protein interactions or have other functions important to DNA replication or transfer. The HUH motif is distinct from the HNH motif found in homing endonucleases and some colicins (Gorbalenya, 1994; Shub et al., 1994). HNH nucleases possess a motif containing two histidines and an asparagine located within a 30-amino acid stretch. Relaxases cleave DNA through a transesterification reaction involving a nucleophilic attack by a Tyr hydroxyl on the 5⬘ side of a DNA phosphate. The reaction generates a stable, long-lived phosphotyrosyl linkage and a free 3⬘ hydroxyl on the DNA. In F-factor transfer, the cleavage occurs in a site and strand-specific manner at the nic location within the origin of transfer (oriT) of the plasmid. The HUH motif has been proposed to play a role in coordination of divalent cations (Ilyina and Koonin, 1992), or in serving as a general base, abstracting the tyrosyl hydroxyl proton to assist the catalytic Tyr in its cleavage activity (Pansegrau et al., 1994). F TraI36, or the F relaxase domain, is an N-terminal 330-amino acid fragment of F factor TraI that possesses in vitro relaxase function (Street et al., 2003). The relaxase activities of the bifunctional F TraI and R388 TrwC proteins are linked to helicase activities (Abdel-Monem et al., 1983; Grandoso et al., 1994; Matson and Morton, 1991; Reygers et al., 1991; Sherman and Matson, 1994; Traxler and Minkley, 1988), and these two activities must be physically linked to permit efficient conjugative transfer (Llosa et al., 1996; Matson et al., 2001). The reason for this requirement is unclear. The relaxase activities of F TraI and R388 TrwC are located within regions approximately 300 amino acids long, a relatively small portion of the proteins that are 1756 and 966 amino acids long, respectively (Byrd et al., 2002; Llosa et al., 1996; Street et al., 2003). F TraI36 has a subnanomolar KD and a high level of sequence specificity for a ssDNA oligonucleotide containing an F factor oriT sequence (Stern and Schildbach, 2001). Single base changes over an 11-base region of this oligonucleotide can reduce affinity by up to three orders of magnitude. To gain insight into the ssDNA recognition and cleavage of TraI36, we crystallized the protein with bound Mg2⫹ (Larkin et al., 2003). Here we describe the three-dimensional structure of TraI36 to 2.6 A˚ resolution. The structure shares structural features with other HUH proteins of known structure but has a different topology and is circularly permuted relative to the other proteins. The ssDNA binding surface of TraI36 is composed largely of uncharged polar residues and small hydrophobic side chains instead of the charged and aromatic amino acids seen in other sequence-spe-
Structure 1370
Figure 1. Experimental and Refined Electron Density of a Selected Portion of the TraI36 Active Site The final refined structure of TraI36 in the region of metal binding cluster and the catalytic residues is overlaid on (left) the density-modified experimental and (right) a 2Fo ⫺ Fc refined electron density. The electron density is contoured at 1. The ligands to the octahedrally coordinated Mg2⫹ ion are His146, His157, His159, and three water molecules. Tyr16 and Tyr17, residues implicated in catalysis, are also labeled.
cific ssDNA binding proteins. The binding surface also features a number of pockets, suggesting that DNA bases bind in a knob-into-hole fashion. Finally, the bound Mg2⫹ is liganded by three histidine side chains. This unusual coordination does not neutralize the Mg2⫹, which has implications for the mechanism of ssDNA cleavage. Results and Discussion Overall Structure The structure of TraI36 was determined using Multiwavelength Anomalous Diffraction using the anomalous scattering of selenomethionine residues (see Experimental Procedures). Experimental and refined electron density for a region of the protein is shown in Figure 1. The TraI36 protein is a prolate ellipsoid (Figures 2 and 3). The center of the protein consists of a five-strand antiparallel  sheet (composed of strands 1, 2, 5, 8, and 9; Figure 3), the strands of which run parallel to the long axis of the ellipsoid. Located on one face of the  sheet are several residues contributing to relaxase function (discussed below). The opposite face of the  sheet packs against an ␣-helix (␣E) that runs parallel to the long axis of the protein. A pair of ␣ helices (␣F and ␣G), running antiparallel to ␣E, packs against both ␣E and one edge of the  sheet (1). Several of these longer secondary structure elements are linked by loop regions composed of a short helix (␣D, for example) or a twostrand antiparallel  sheet (3 and 4; 10 and 11; 12 and 13). TraI36 crystallized in space group P3121 with three molecules in the asymmetric unit. The electron density for molecules A and B was of generally higher quality than for molecule C, which is reflected in the lower mean B factors for molecules A and B relative to molecule C (Table 1). The maps were interpretable over the same relative regions for all three molecules, however, and
the average pair-wise root mean square deviations in C␣ positions were small for the different molecules (ⱕ0.8 A˚). Although molecules A and B are related by approximate 2-fold symmetry, there is little evidence consistent with dimer formation. The solvent accessible surface area buried by the interaction (ⵑ800 A˚2 per monomer; ⬍6% of total accessible surface area), the number of contact residues in the interface (25), and the gap volume index (7.18; all calculated using the Protein-Protein Interaction server [http://www.biochem.ucl.ac.uk/bsm/PP/server]) are all more consistent with nonbiological rather than biological crystal contacts (Jones and Thornton, 1996; Valdar and Thornton, 2001). Furthermore, TraI36 is monomeric in solution under all conditions tested (Street et al., 2003), suggesting that the observed crystal packing does not have functional significance. The first 235 amino acids of all three molecules in the asymmetric unit could be located in the electron density map. Starting at amino acid 236, weak electron density prevented us from tracing the chain. A second region, consisting of two large ␣ helices (␣I and ␣J) joined by a bend, starts at amino acid 268 and continues through 306. These two ␣ helices pack against one end of the protein, interacting with residues in the loops joining the strands and helices in the core domain. Beyond 306, the chain could not be traced. The poorly defined region between amino acids 235 and 268 correlates well with a trypsin-sensitive region of the TraI36 molecule (Street et al., 2003). A fragment consisting of amino acids 1–236 binds ssDNA poorly (Street et al., 2003). This is likely the result of loss of direct contacts between DNA and amino acids between 236 and 330. If so, the regions that are disordered in the structure may become better ordered upon binding DNA. Structurally Related Proteins Through DALI searches (Holm and Sander, 1993), we identified a number of proteins that have statistically
Sequence-Specific Single-Stranded DNA Nuclease 1371
Figure 2. Stereoview of TraI36 Relative positions of the N terminus (N), one Tyr in each of the two N-terminal Tyr pairs (Y16 and Y24), two residues involved in DNA recognition (Q193 and R201), and the beginning (V268) and end (G306) of the C-terminal helices are marked. The regions between residues 235 and 268, and between residues 306 and 330 were disordered.
significant structural similarity to TraI36. Of these, the best match in terms of structural (DALI Z-score ⫽ 3.5) and functional similarity is the Rep protein from Adenoassociated virus serotype 5 (AAV-5) (Hickman et al., 2002). Both AAV-5 Rep and TraI belong to the family of rolling circle replication and mobilization proteins, described by Koonin and Ilyina (Ilyina and Koonin, 1992; Koonin and Ilyina, 1993), that possess the HUH sequence motif. Structurally, AAV-5 Rep and TraI36 share
the five-strand  sheet and bordering ␣ helices (Figure 4) and also share approximate positions of catalytic Tyr side chains and metal binding clusters (discussed below). Prior to determining the structure of TraI36, we had tried to detect structural similarity between AAV-5 Rep and TraI36 based on similarities in sequence and predicted secondary structure. Our inability to do so was probably for two reasons. First, the secondary structural elements of AAV-5 Rep are joined by comparatively
Figure 3. Topology of the F Factor TraI Relaxase Domain Shown are (left and center) two faces of the protein, related by a 180⬚ rotation about the vertical axis.  strands are shown as blue arrows, helices are shown as yellow ribbons, and coil regions are shown as green lines. Also shown (right) is a schematic diagram of the protein fold. Strands (blue arrows labeled 1–13 starting from the N-terminal end) and helices (yellow cylinders labeled ␣A–␣J) were identified using DSSP (Kabsch and Sander, 1983). Coil regions are indicated by solid black lines, while regions that could not be traced are depicted as dashed black lines.
Structure 1372
Table 1. Data Collection and Refinement Statistics
Space group Unit cell lengths (A˚) Data collection1 Wavelength (A˚) Resolution range (A˚) Total observations Redundancy I/I Rmerge (%)2 Completeness (%) Phasing3 Overall Fig. of Merit Acentric Centric Phasing power Refinement Resolution (A˚) Rcryst Rfree R.m.s. dev. from ideal Bonds (A˚) Angles (⬚) No. of atoms/A.U. No. of amino acids/A.U. No. of waters/A.U. No. ethylene glycols/A.U. Mean B factor (A˚2) Mean B factor, mol. A (A˚2) Mean B factor, mol. B (A˚2) Mean B factor, mol. C (A˚2) Mean B factor, waters (A˚2) R.m.s. deviation, B factor Main chain, bonded Side chain, bonded
Native (CHESS)
SeMet (CHESS)
Native (Cu K␣)
P3121 a ⫽ b ⫽ 128.2, c ⫽ 121.2
P3121 a ⫽ b ⫽ 127.5, c ⫽ 119.7
P3121 a ⫽ b ⫽ 128.1, c ⫽ 120.8
0.9713 25–2.6 (2.76–2.60) 34,650 11.2 40.0 (8.7) 5.4 (33.7) 96.8 (92.3)
0.9789 (peak) 25–3.0 (3.11–3.0) 22,667 10.8 17.1 (2.0) 10.8 (–) 100 (100)
1.5418 25–3.0 (3.21–3.0) 21,150 6.7 25.8 (6.8) 7.6 (34.3) 99.9 (100)
0.4961 (20,363) 0.3085 (1,964) 1.87 2.6 0.233 0.278 0.007 1.2 6,656 819 158 11 67.3 48.1 52.2 103.1 50.2 1.59 2.19
1
Numbers in parentheses are for highest resolution bin. Rmerge ⫽ ⌺|I ⫺ ⬍I⬎|/⌺ (I) 3 Figure of merit and phasing power calculated by SHARP. Number of reflections are shown in parentheses. 2
small loop regions whereas the analogous elements of TraI36 are often joined by loop regions that themselves contain short helices or  sheets. Second, the shared secondary structural elements are arranged differently in the two proteins. In both proteins, the two outer strands of the  sheet are separated in sequence by a helix containing the catalytic residues and are followed by a second helix. In TraI36, however, these elements start at the N terminus of the protein and are followed by the three core strands. In Rep, the three core strands are N-terminal to the outer strands and helices. In other words, TraI36 is circularly permuted relative to AAV-5 Rep. Both the different order of secondary structural elements and the additional secondary structural elements found in TraI36 but not AAV-5 Rep obscure the structural relationship between these proteins on the sequence level. In DALI alignments, TraI36 does not show significant structural similarity (Z-score ⫽ 1.4; ⱖ2 considered significant) to the tomato yellow leaf curl virus (TYLCV) Rep (Campos-Olivas et al., 2002). This result comes despite TYLCV Rep possessing a five-sheet antiparallel  sheet (Figure 4) and participating in rolling circle replication. Furthermore, both TYLCV Rep and AAV-5 Rep are members of a diverse group of structurally related proteins,
identified by statistically significant DALI Z-scores, involved in various aspects of nucleic acid metabolism in viruses, prokaryotes, and eukaryotes (Campos-Olivas et al., 2002). TYLCV Rep and AAV-5 Rep share a similar topology (Figure 4) but show little sequence identity to each other or TraI36. TYLCV Rep has the same  sheet topology as AAV-5 Rep but lacks the extended helices found in both AAV-5 Rep and TraI36. Perhaps the differences in topology and in the length and number of helical regions cause the structural similarity between TYLCV Rep and TraI36 to fall short of statistical significance, while similarities between AAV-5 Rep and TraI36, and between AAV-5 Rep and TYLCV Rep are found. Another protein showing relatively high structural similarity (DALI Z-score ⫽ 3.8) to TraI36 is the E. coli biotin repressor BirA (Weaver et al., 2001; Wilson et al., 1992). BirA binds to the biotin operator to repress transcription of the biotin biosynthetic operon. BirA also catalyzes the formation of biotin adenylate and the transfer of biotin from biotin adenylate to biotin carboxyl carrier protein. BirA and TraI36 share the antiparallel  sheet and helices that pack against it. The similarity, however, occurs with the BirA C-terminal biotin binding catalytic region, rather than its N-terminal double-stranded DNA binding domain. As was the case with AAV-5 Rep, the
Sequence-Specific Single-Stranded DNA Nuclease 1373
Figure 4. Comparison of Folds of HUH Proteins Shown are (left) TraI36 (PDB accession code 1P4D), (center) Rep protein from AAV-5 (Hickman et al., 2002) (PDB accession code 1M55), and (right) Rep protein from tomato yellow curl leaf virus (Campos-Olivas et al., 2002) (PDB accession code 1L2M). Structural similarities between TraI36 and AAV-5 Rep, and between AAV-5 Rep and TYLCV Rep were identified using DALI (Holm and Sander, 1993). Below each structure is shown the topology of its five-strand  sheet, including the relative orientations of its N and C termini. TraI36 is circularly permuted relative to the other two proteins.
sequence order of the strands that form the  sheet is not conserved between BirA and TraI36. DNA Binding The relaxase active site including a Mg2⫹ binding cluster (described in detail below) is located near an edge of one face of the protein. A cleft ranging in width from 11–15 A˚ extends from the catalytic site across the face of the protein (Figures 5 and 6). This cleft represents much of the DNA binding surface of the protein, based on the number of positions in the cleft at which substitutions cause reductions in affinity (Figure 5 and below). For example, single Ala substitutions for Ser149, Glu153, or Gln155 each reduce binding by a minimum of 15-fold (M.J. Harley and J.F.S., unpublished data). The ssDNA recognition of TraI36 is asymmetric relative to the nic site, with the majority of sequence-specific recognition occurring with bases 5⬘ to nic (Stern and Schildbach,
2001). Therefore, bound ssDNA should be oriented with nic at the active site, extending toward its 5⬘ end through the cleft, across the face of the molecule. Consistent with this, Arg150, which has been shown to interact with bases 3⬘ to nic (Harley et al., 2002), is located near the catalytic site (Figures 5 and 6). The cleft has two striking features. The first of these is the amino acid composition of the surface. Excluding the amino acids involved in metal coordination and DNA cleavage, there are no aromatic side chains along the surface of the cleft. This is surprising given the frequently observed stacking of aromatic side chains against DNA bases in structures of complexes of proteins and ssDNA. This stacking is seen both in sequence-specific proteins such as Oxytricha nova telomere end binding protein (OnTEBP) (Peersen et al., 2002) and nonspecific proteins such as E. coli single-stranded DNA binding protein (Raghunathan et al., 2000). These stacking interactions
Structure 1374
Figure 5. DNA Binding Cleft and Residues Important to DNA Recognition The secondary structure (left) and molecular surface representation (right) of the region of the protein including the cleft is shown. Tyr16 is necessary for catalysis but not binding (Street et al., 2003). Arg8 is located near Tyr16 and may play a role in both binding and catalysis, but this has not been demonstrated. Arg150 contributes to ssDNA recognition 5⬘ to nic (Harley et al., 2002). All other residues depicted have been shown to contribute to ssDNA recognition (Harley and Schildbach, in preparation).
appear to contribute enormously to high-affinity ssDNA binding by S. cerevisiae Cdc13 protein (Anderson et al., 2003; Mitton-Fry et al., 2002). The binding cleft of TraI36 also has relatively few charged amino acid side chains, and those present tend to be solvent exposed. The electrostatic potential at the TraI36 surface was calculated by the Poisson-Boltzmann equation as implemented in SPOCK (Christopher, 1997). The map, scaled so that maximal positive and negative values of the potential (45 kT and ⫺45 kT) are contoured to the most intense blue and red colors, respectively, is shown in Figure 6. The greatest electrostatic potentials are located within the binding cleft, but these appear as discreet areas of both positive and negative charge. In contrast, AAV-5 Rep has a positive electrostatic potential throughout its binding cleft (Hickman et al., 2002). KH domains show a continuous band of positive electrostatic potential in their binding cleft, where the protein interacts with the DNA phosphoribose backbone (Braddock et al., 2002). In TraI36, one region of positive potential is associated with Arg201, which plays a role in binding specificity (see below). The second is associated with the bound Mg2⫹, which is essen-
Figure 6. Electrostatic Potential Is Greatest in the DNA Binding Cleft The electrostatic potential was calculated using the Poisson-Boltzmann equation as implemented in SPOCK (Christopher, 1997) and displayed on a molecular surface representation of the protein. The contouring was set so that the most intense red and blue colors correspond to the greatest negative (⫺45 kT) and positive (45 kT) values of the calculated potential. The most negative potential values are in the region of Asp81 and near Glu216 and Glu225. The most positive potential values are in the region of the histidinecoordinated Mg2⫹ ion and Arg201. The protrusion over the cleft is the side chain of Arg150.
tial for cleavage and may help position the sissile phosphate. One region of negative potential is associated with Asp 81, which may help position the DNA bases near nic for cleavage. Most charged regions in TraI36 thus appear to have roles in DNA binding or cleavage. The amino acids lining the TraI36 binding cleft are mostly nonaromatic hydrophobic residues and uncharged polar residues. The reliance upon direct contacts between these amino acids and DNA suggests that a high degree of surface complementarity is a prerequisite for high affinity sequence-specific binding by TraI36. In contrast, other ssDNA binding proteins frequently use amino acids that can contribute potentially strong, long-range charge-charge interactions or the less specific stacking of the planar surfaces of aromatic side chains on DNA bases. The second striking aspect of the TraI36 binding cleft is the presence of at least three depressions or pockets within the cleft. Most other sequence-specific ssDNA binding proteins have comparatively smooth binding surfaces. One exception is OnTEBP, which has a pocket that can accommodate a DNA base (Peersen et al., 2002). For OnTEBP, however, portions of this binding pocket are formed by other DNA bases and the DNA backbone packing against the base sitting in the pocket. In contrast, the walls of the pockets in TraI36 are entirely formed by the protein. The size of these pockets in TraI36 (7–10 A˚ in diameter; 4–8 A˚ deep) is similar to the size of the pocket observed in OnTEBP, suggesting that the pockets in TraI36 can accommodate at least portions of DNA bases in a knob-into-hole fashion. Binding in this fashion would allow considerable shape complementarity, thereby helping to determine binding specificity. One of the pockets in TraI36 is formed on opposite sides by the side chains of Gln193 and Arg201, which are separated by 9 A˚ (Figures 5 and 7). The other sides are formed by Ile194 and Met1, located about 8 A˚ apart, and the bottom of the pocket is formed by Gly197. Amino acid substitutions for Gln193 and Arg201, both located within the cleft, can reduce affinity by ⱖ3 orders of magnitude (Harley and Schildbach, 2003). These two amino acids differ between F TraI and the TraI protein from F-like plasmid R100, which has an Arg193 instead of Gln, and Gln201 instead of Arg. The plasmid DNA sequences recognized by F and R100 TraI differ by two bases out of 11, and each protein binds its cognate ssDNA site with at least 1000-fold higher affinity than the other site. Replacing the residues at positions 193
Sequence-Specific Single-Stranded DNA Nuclease 1375
Figure 7. A Pocket within the Binding Surface of TraI36 Can Accommodate a DNA Base Shown is a model of Gua (shown as sticks) docked into a pocket of TraI36, shown as a molecular surface. The base fits well into the pocket, with little overlap in the van der Waals surfaces of the protein and the base, even without energy minimization of the complex or rearrangement of the protein or base. The surfaces of Arg201 (left) and Gln193 (right) are shown with carbons colored green, nitrogens colored blue, and oxygens colored red. The remainder of the binding pocket is largely hydrophobic. In this orientation, a hydrogen bond can be formed between the Gua O6 and the Arg side chain.
and 201 with those from R100 TraI confers an R100-like binding specificity on F TraI36 (Harley and Schildbach, 2003). One of the differences between the oriT DNA sequences is a Gua in one position in F versus an Ade in the equivalent position R100. A Gua can be docked into the pocket defined by Gln193 and Arg201 in the F TraI36 structure in an orientation that allows excellent surface complementarity without rearrangements in the protein structure (Figure 7). When docked in this manner, the O6 of Gua is orientated to interact favorably with the Arg201 of F. In the same relative orientation, a bound Ade would have its N6 atom located to interact favorably with the Gln201 of R100 (data not shown), in part explaining the specificity difference between these two proteins. Regardless of the precise basis of specificity, there is a strong correlation between binding specificity and the pocket lined with Gln193 and Arg201. The other binding pockets are also likely to play a significant role in determining binding specificity. Catalytic Site Magnesium is required for the catalytic activity of relaxases (Byrd and Matson, 1997), but not for high-affinity binding of TraI36 (Stern and Schildbach, 2001). We interpreted a region of strong electron density near the catalytic site as a Mg2⫹ ion (Figure 8). This conclusion was based on the observed coordination geometry (described below) and on results from an anomalous difference Fourier calculated from a data set collected using a Cu rotating anode source (1.5418 A˚; Table 1). Peaks of 2–4 corresponding to S atoms were observed in the region of the metal binding cluster in the anomalous difference Fourier map (data not shown). No peak was observed for the metal. This is consistent with identification of the bound ion as Mg2⫹ because Mg2⫹ shows little anomalous scattering at this wavelength (f″ ⫽ 0.18)
relative to S (f″ ⫽ 0.56). We would expect a peak greater than those for the S atoms if the metal binding site were occupied by Sr (f″ ⫽ 1.82; SrCl2 was a component of the crystallization solution [Larkin et al., 2003] ), or other biologically relevant metals such as Mn2⫹ (f″ ⫽ 2.79), Ca2⫹ (f″ ⫽ 1.30), Zn2⫹ (f″ ⫽ 0.69), or Ni2⫹ (f″ ⫽ 0.52). In molecules A and B, the metal ion has the octahedral coordination geometry expected with Mg2⫹ coordination. Three ligands are provided by protein side chains, including the N␦1 of His146, and the N⑀2 atoms of His157 and 159 (Figure 8). This pattern of histidine coordination requires two of the three histidines to be in the N␦H tautomeric form. Although in unstructured histidyl residues this is disfavored, the bias is small (⌬G⬚ⱕ0.7 kcal/ mol for tautomerization [Barrick, 2000]), indicating that the observed ligation pattern is energetically accessible. Consistent with this, several N⑀2 ligands have been observed, including the participation of two His N⑀2 atoms in the coordination of Zn2⫹ in the related AAV-5 Rep protein (Hickman et al., 2002). Furthermore, the N␦1 of His157 serves as a hydrogen bond donor to the backbone carbonyl of Gln155, and the N␦1 of His159 serves as a hydrogen bond donor of the Asp81 side chain carboxylate, consistent with the N⑀2 atoms of these side chains being unprotonated and capable of ligating Mg2⫹. In molecule A, water molecules occupy the remaining three ligand positions. In molecules B and C, however, there is density for only one liganded water molecule. In molecule B, this water is in the ligand position opposite His146 N␦1. In molecule C, the water is also located approximately opposite His146 N␦1. The geometry in molecule C, however, appears to be distorted away from an octahedral arrangement toward a more tetrahedrallike arrangement, although this probably reflects relatively weak density preventing precise positioning of the water rather than a difference in coordination. No other bound metal ions were identified. The absence of other metal ions and the proximity of the observed Mg2⫹ ion to a catalytic residue (Figure 8) suggests that this single structurally well-defined Mg2⫹ is involved in TraI nucleolytic activity. Consistent with the importance of Mg2⫹ for relaxase function, TraI36 variants having Ala replacing either His157 or His159 have greatly reduced in vitro ssDNA cleavage activity (M.J. Harley and J.F.S., unpublished data). Substitution of Ala for His146 does not affect DNA binding or apparent cleavage activity (M.J. Harley and J.F.S., unpublished data), despite its participation in coordinating Mg2⫹. In the H146A variant, however, Mg2⫹ coordination may be maintained by Asn22, which is located near His146 and the bound metal. By adopting a different side chain rotamer position than is seen in the wild-type structure, Asn22 could place its O␦1 atom ⬍1.5 A˚ from the His146 N␦1. If Asn22 adopts this position in the H146A mutant, the mutant may still coordinate Mg2⫹ and function as wild-type. Because three neutral His side chains coordinate its metal, the charge of the bound Mg2⫹ in TraI36 is not neutralized. This contrasts with most metal-dependent nucleases, including AAV-5 Rep, which use at least one acidic side chain. Although common in Zn and Cu binding proteins, coordination of Mg2⫹ and Mn2⫹ by three His side chains is rare. A search of the Metalloprotein
Structure 1376
Figure 8. The bound Mg2⫹ Is Coordinated by Three His Side Chains and Water Molecules Shown is the bound Mg2⫹ from molecule A in the asymmetric unit. The Mg2⫹ is coordinated by His146 N␦1, His157 N⑀2, His159 N⑀2, and three water molecules. Only one Mg2⫹-coordinated water molecule could be positioned in molecules B and C (see text). One water molecule hydrogen bonds with Tyr16, and another is a hydrogen bond donor to the carbonyl of Thr148. Distances between atoms are: A, 2.95 A˚ (Tyr16 OH-water); B, 2.25 A˚ (water-Mg2⫹); C, 2.37 A˚ (His159 N⑀2-Mg2⫹); D, 2.02 A˚ (water-Mg2⫹); E, 2.28 A˚ (His157 N⑀2Mg2⫹); F, 1.92 A˚ (water-Mg2⫹); G, 3.24 A˚ (water-Thr148 O); H, 2.57 A˚ (His146 N␦1-Mg2⫹).
Database and Browser (MDB) (Castagnetto et al., 2002) failed to yield any other proteins with bound Mg2⫹ or Mn2⫹ solely coordinated by three His residues. Those proteins having bound Mg2⫹ or Mn2⫹ coordinated by three His side chains also have a fourth ligand provided by an acidic side chain. The most frequently occurring examples of these proteins in the database are species and sequence variants of manganese superoxide dismutase. A structure of colicin E7, an HNH protein, was obtained using data from a crystal soaked with Mn2⫹ (Sui et al., 2002). In this structure, which has not been submitted to the PDB and therefore was not identified in our MDB search, a bound Mn2⫹ ion is seen coordinated by three His side chains in a location occupied by Zn2⫹ ion in the original structure (Ko et al., 1999). Both colicin E7 and the related colicin E9 require divalent cations for their nuclease activities, and Mn2⫹ serves well in this role. Other data, however, suggests that in the cell a Zn2⫹ ion is bound by these colicins at this three-His site, serving a structural rather than catalytic function (Ku et al., 2002; Pommer et al., 1999; Sui et al., 2002). The extent of the similarities between these colicins and TraI in active site structure, metal usage, and function is still unclear. Substitution of Phe for either Tyr16 or Tyr17 dramatically reduces in vitro ssDNA catalysis by TraI36 but does not alter binding affinity (Street et al., 2003). One Tyr side chain likely provides the oxygen nucleophile for cleavage, or the two somehow cooperate to carry out cleavage. The aromatic side chains of these two amino acids interact through edge-to-face ring stacking. The distance between the Tyr hydroxyls (5.9 A˚; average of the three molecules in the asymmetric unit) and the absence of residues or water molecules positioned to interact with both Tyr side chains, suggests that it is unlikely that both Tyr hydroxyls contribute directly to the catalytic mechanism. (This assumes that there is no dramatic structural rearrangement in the active site upon binding ssDNA. Although not expected, such rearrangements are possible since N-terminal to the Tyr residues is a region [positions 9–14] with a sequence [SerAlaGlySerAlaGly] that may confer considerable flexibility on the region.) Tyr17 is partially buried, and its hydroxyl group less accessible and further away from
the bound Mg2⫹ than that of Tyr16 (8.2 A˚ versus 4.8 A˚), making Tyr17 a less likely catalytic residue. If Tyr17 is not the catalytic residue, the reason for the reduced activity when Tyr17 is replaced by Phe is not obvious. It may result from a minor rearrangement of side chains within the active site caused by, for example, loss of the hydrogen bond between the Tyr17 hydroxyl and the Asp81 side chain. The Tyr16 hydroxyl does not directly interact with the bound Mg2⫹ but forms a hydrogen bond with one of the Mg2⫹-coordinated water molecules. No other interactions between the Tyr16 hydroxyl and waters or amino acids are observed. The catalytic hydroxyl might be expected to interact with a general base that would help position the Tyr16 hydroxyl and make it more nucleophilic. In this case, Mg2⫹ could act to lower the pKa of the water molecule bridging the metal ion and the Tyr16 hydroxyl, permitting generation of a hydroxide ion that could then act as a general base (Galburt and Stoddard, 2002). The charge on Mg2⫹ would help to stabilize the hydroxide ion. The feasibility of this mechanism is increased by the His liganding of the Mg2⫹, which does not neutralize the Mg2⫹ charge, allowing a greater effect on the pKa of the water molecule. This mechanism is reminiscent of the role of the His-liganded Zn in carbonic anhydrase, although the hydroxide ion in carbonic anhydrase is a catalytic nucleophile rather than a general base (Christianson and Cox, 1999). An alternative source for a general base during TraI36 DNA cleavage might be a nonbridging phosphate oxygen, if it were properly oriented. The bound Mg2⫹ may also assist in orienting the ssDNA substrate for catalysis. Phosphate oxygens of bound ssDNA might occupy the positions of one or more Mg2⫹liganded water molecules, thus positioning the phosphorus for a nucleophilic attack by the Tyr16 hydroxyl oxygen. Assuming that catalysis occurs by an in-line mechanism, the 3⬘ oxygen at nic would be distal to the catalytic Tyr. The Mg2⫹ ion may also stabilize a ⫺2 charge of a phosphoanion transition state, as has been proposed for a number of endonucleases (Galburt and Stoddard, 2002). Again, the Mg2⫹ should be assisted in this role by its unneutralized positive charge. The Arg8 side chain is located near Tyr16. Arg8 may contribute
Sequence-Specific Single-Stranded DNA Nuclease 1377
to catalysis by donating a proton to the leaving group or by contacting and positioning the ssDNA substrate, although the effects of substitutions at this position have not yet been determined. In addition to the first pair of tyrosines at positions 16 and 17, F TraI has a second pair at positions 23 and 24 (Figure 2). For TrwC, the relaxase of IncW conjugative plasmid R388, initiation and termination of transfer is proposed to require sequential action by the equivalent of Tyr16 and Tyr23 (Grandoso et al., 2000). A similar twoTyr mechanism has been proposed for other proteins, including RP4 TraI (Pansegrau and Lanka, 1996), A protein from bacteriophage P2 (Odegrip and Haggard-Ljungquist, 2001), and the gene A protein of φ174 (Hanai and Wang, 1993). In F TraI36, Tyr16 and Tyr23 are located on opposite faces of the protein (Figure 2). This raises the possibility that the sequential activity involves an interaction between protein and DNA first on one, then on the other, face of the protein. By segregating the Tyr residues on opposite faces of the protein, the second Tyr residue will be accessible even while the first directly interacts with bound ssDNA. This may be a requirement for protein activity, as the first Tyr forms a long-lived covalent intermediate with DNA 3⬘ to nic and is proposed to ligate the ends of the plasmid to terminate transfer. If second-strand synthesis extends from nic during transfer, a free 3⬘ hydroxyl would have to be generated at nic to allow for proper circularization of the plasmid and regeneration of the nic site. This free hydroxyl could, in principle, be generated by Tyr23. There is, however, no observed bound metal or obvious DNA binding surface near Tyr23. If TraI activity required sequential activity by Tyr16 and Tyr23, this would require either substantial structural rearrangement or significantly different modes of DNA recognition and catalysis in the catalytic steps involving the two residues. Experimental Procedures TraI36 protein was purified (Street et al., 2003) and crystallized (Larkin et al., 2003) as described. Synchrotron diffraction data sets were collected at CHESS F-2 beamline (Cornell High Energy Synchrotron Source, Ithaca, NY) using an Area Detector Systems CCD Quantum210 detector (Szebenyi et al., 1997) and an Oxford Cryosystems cryostat. Data were indexed, processed, and scaled using the HKL processing package (Otwinowski and Minor, 1997). Experimental phases were derived by the MAD method using the SeMet data set. Redundant data sets were collected at the Se absorption edge (0.9789 A˚), the inflection (0.9793 A˚), and at a remote wavelength (0.9713 A˚). Se sites (39 of 39) in the asymmetric unit were located with the program SnB v. 2.2 (Weeks and Miller, 1999) using the Se edge data. The data were scaled, normalized, and filtered using the DREAR suite of programs (Blessing et al., 1998; Blessing and Smith, 1999). Trial solutions showed a bimodal distribution (Rbest ⫽ 0.43 versus Rworst ⫽ 0.53) with 2 of 575 solutions being correct. The Se sites for the best solution were refined using the program SHARP v. 2.0 (de la Fortelle and Bricogne, 1997). Protein phases were calculated with SHARP using all three data sets, and the electron density map was improved by a combination of solvent flipping using Solomon (Abrahams and Leslie, 1996) and density modification using dm (Cowtan, 1994), both as implemented in SHARP. Maps were inspected and a skeletonized version of the map was built using MAPMAN (Kleywegt and Jones, 1996) to facilitate interpretation. Model building was performed using TURBO-FRODO (Roussel and Cambillau, 1991). A partial protein model was built into the region of the asymmetric unit with the strongest density. Subsequently two noncrystallo-
graphically symmetry related copies of this model were manually positioned in the asymmetric unit. Initially, the model was refined to 3.0 A˚ against the experimental phases. The test data set for calculation of Rfree included 10% of the data. Rigid body, positional, and simulated annealing refinement and overall B factor refinement for each molecule were performed using CNS (Brunger et al., 1998). Following initial rigid body refinement, real space transformation matrices were determined using CNS, and strict noncrystallographic symmetry (NCS) restraints were used in initial rounds of refinement. After the first few rounds of model building and refinement, experimental phases were combined with phases calculated from the model. After several rounds, the model was refined against the native data set to 2.6 A˚ resolution using phases calculated from the model. Prior to refinement against the native data set, the test data set for Rfree calculation was extended to include the higher resolution data. Grouped B factors for each amino acid main chain and side chain were refined using CNS. Strict NCS was replaced by NCS restraints, and NCS restraints were lifted in final rounds of refinement. After restrained individual B factor refinement, water and ethylene glycol molecules were placed. The final model includes 819 amino acids with an alternate conformation for Arg115 in molecule A, 158 water molecules, and 11 ethylene glycol molecules. In the final model, 84.6% of nonproline, nonglycine residues were in most-favored regions of a Ramachandran plot, 14.8% in additional allowed regions, 0.6% in generously allowed regions, and none in disallowed regions, as analyzed using PROCHECK (Laskowski et al., 1993). To test for anomalous scattering of Cu K␣ radiation by the Hisbound ion, a data set was collected using a RU-H3RHB generator outfitted with a rotating copper anode and an R-AXIS IV⫹⫹ imaging plate system (RigakuMSC). Data were indexed, processed, and scaled using the HKL processing package. An anomalous difference Fourier was calculated using CNS. Figure 6 was prepared using SPOCK (Christopher, 1997), calculated with 230 mM salt, and contoured so that maximum color intensity was at 45 kT and ⫺45 kT for blue and red, respectively, and rendered using Raster3D (Merritt and Bacon, 1997). The topology schematic in Figure 3 was generated using Adobe Illustrator 10.0. All other figures were prepared and rendered using PyMOL (DeLano, 2002). Color conversion, sizing, and labeling were performed using Adobe Photoshop 6.0. This manuscript is based on work supported by National Institutes of Health grant GM61017. We thank Dan Leahy for helpful discussions, Doug Barrick for insightful comments on the manuscript, and Mark Zweifel for help with electrostatic potentials calculations. Brian Couts and Rachel Heiman were invaluable in preparation of the figures. Cu K␣ data were collected on an instrument purchased in part with National Science Foundation Multi-User Equipment Grant 9970207. Synchrotron data were collected at the Cornell High Energy Synchrotron Source (CHESS), which is supported by the National Science Foundation under award DMR 97-13424, using the Macromolecular Diffraction at CHESS (MacCHESS) facility, which is supported by award RR-01646 from the National Institutes of Health, through its National Center for Research Resources. We thank David Schuller, Bill Miller, Ernie Fontes, and Tony Lloyd for assistance in data collection and analysis at CHESS. Received: May 21, 2003 Revised: June 27, 2003 Accepted: July 8, 2003 Published online: October 8, 2003 References Abdel-Monem, M., Taucher-Scholz, G., and Klinkert, M.Q. (1983). Identification of Escherichia coli DNA helicase I as the traI gene product of the F sex factor. Proc. Natl. Acad. Sci. USA 80, 4659–4663. Abrahams, J.P., and Leslie, A.G.W. (1996). Methods used in the structure determination of bovine mitochondrial F-1 ATPase. Acta Crystallogr. D 52, 30–42. Anderson, E.M., Halsey, W.A., and Wuttke, D.S. (2003). Site-directed mutagenesis reveals the thermodynamic requirements for single-
Structure 1378
stranded DNA recognition by the telomere-binding protein Cdc13. Biochemistry 42, 3751–3758. Barrick, D. (2000). Trans-substitution of the proximal hydrogen bond in myoglobin: II. Energetics, functional consequences, and implications for hemoglobin allostery. Proteins 39, 291–308. Blessing, R.H., Guo, D.Y., and Langs, D.A. (1998). Intensity statistics and normalization. In Direct Methods for Solving Macromolecular Structures, S. Fortier, ed. (Dordrecht, Netherlands: Kluwer Academic Publishers), pp. 44–71. Blessing, R.H., and Smith, G.D. (1999). Difference structure-factor normalization for heavy-atom or anomalous-scattering substructure determinations. J. Appl. Crystallogr. 32, 664–670. Braddock, D.T., Louis, J.M., Baber, J.L., Levens, D., and Clore, G.M. (2002). Structure and dynamics of KH domains from FBP bound to single-stranded DNA. Nature 415, 1051–1056. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. (1998). Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr. D 54, 905–921. Byrd, D.R., and Matson, S.W. (1997). Nicking by transesterification: the reaction catalysed by a relaxase. Mol. Microbiol. 25, 1011–1022. Byrd, D.R., Sampson, J.K., Ragonese, H.M., and Matson, S.W. (2002). Structure-function analysis of Escherichia coli DNA helicase I reveals nonoverlapping transesterase and helicase domains. J. Biol. Chem. 277, 42645–42653. Campos-Olivas, R., Louis, J.M., Clerot, D., Gronenborn, B., and Gronenborn, A.M. (2002). The structure of a replication initiator unites diverse aspects of nucleic acid metabolism. Proc. Natl. Acad. Sci. USA 99, 10310–10315. Castagnetto, J.M., Hennessy, S.W., Roberts, V.A., Getzoff, E.D., Tainer, J.A., and Pique, M.E. (2002). MDB: the Metalloprotein Database and Browser at The Scripps Research Institute. Nucleic Acids Res. 30, 379–382. Christianson, D.W., and Cox, J.D. (1999). Catalysis by metal-activated hydroxide in zinc and manganese metalloenzymes. Annu. Rev. Biochem. 68, 33–57. Christopher, J.A. (1997). SPOCK: The Structural Properties Observation and Calculation Kit (College Station, TX: The Center for Macromolecular Design, Texas A&M University). Cowtan, K. (1994). dm: an automated procedure for phase improvement by density modification. Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography 31, 34–38. de la Fortelle, E., and Bricogne, G. (1997). Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. In Methods in Enzymology, Vol. 276, Macromolecular Crystallography, Part A. C.W. Carter and R.M. Sweet, eds. (San Diego, CA: Academic Press), pp. 472–494. DeLano, W.L. (2002). The PyMOL Molecular Graphics System (San Carlos, CA: DeLano Scientific). Frost, L.S., Ippen-Ihler, K., and Skurray, R.A. (1994). Analysis of the sequence and gene products of the transfer region of the F sex factor. Microbiol. Rev. 58, 162–210. Galburt, E.A., and Stoddard, B.L. (2002). Catalytic mechanisms of restriction and homing endonucleases. Biochemistry 41, 13851– 13860. Gorbalenya, A.E. (1994). Self-splicing group I and group II introns encode homologous (putative) DNA endonucleases of a new family. Protein Sci. 3, 1117–1120. Grandoso, G., Avila, P., Cayon, A., Hernando, M.A., Llosa, M., and de la Cruz, F. (2000). Two active-site tyrosyl residues of protein TrwC act sequentially at the origin of transfer during plasmid R388 conjugation. J. Mol. Biol. 295, 1163–1172. Grandoso, G., Llosa, M., Zabala, J.C., and de la Cruz, F. (1994). Purification and biochemical characterization of TrwC, the helicase involved in plasmid R388 conjugal DNA transfer. Eur. J. Biochem. 226, 403–412.
Hanai, R., and Wang, J.C. (1993). The mechanism of sequencespecific DNA cleavage and strand transfer by phi X174 gene A* protein. J. Biol. Chem. 268, 23830–23836. Harley, M.J., and Schildbach, J.F. (2003). Swapping single-stranded DNA sequence specificities of relaxases from conjugative plasmids F and R100. Proc. Natl. Sci. USA 100, 11243–11248. Harley, M.J., Toptygin, D., Troxler, T., and Schildbach, J.F. (2002). R150A mutant of F TraI relaxase domain: reduced affinity and specificity for single-stranded DNA and altered fluorescence anisotropy of a bound labeled oligonucleotide. Biochemistry 41, 6460–6468. Hickman, A.B., Ronning, D.R., Kotin, R.M., and Dyda, F. (2002). Structural unity among viral origin binding proteins: crystal structure of the nuclease domain of adeno-associated virus Rep. Mol. Cell 10, 327–337. Holm, L., and Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138. Ilyina, T.V., and Koonin, E.V. (1992). Conserved sequence motifs in the initiator proteins for rolling circle DNA replication encoded by diverse replicons from eubacteria, eucaryotes and archaebacteria. Nucleic Acids Res. 20, 3279–3285. Im, D.S., and Muzyczka, N. (1990). The AAV origin binding protein Rep68 is an ATP-dependent site-specific endonuclease with DNA helicase activity. Cell 61, 447–457. Jones, S., and Thornton, J.M. (1996). Principles of protein-protein interactions. Proc. Natl. Acad. Sci. USA 93, 13–20. Kabsch, W., and Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637. Kleywegt, G.J., and Jones, T.A. (1996). xdlMAPMAN and xdlDATAMAN—programs for reformatting, analysis and manipulation of biomacromolecular electron-density maps and reflection data sets. Acta Crystallogr. D 52, 826–828. Ko, T.P., Liao, C.C., Ku, W.Y., Chak, K.F., and Yuan, H.S. (1999). The crystal structure of the DNase domain of colicin E7 in complex with its inhibitor Im7 protein. Structure 7, 91–102. Koonin, E.V., and Ilyina, T.V. (1993). Computer-assisted dissection of rolling circle DNA replication. Biosystems 30, 241–268. Ku, W.Y., Liu, Y.W., Hsu, Y.C., Liao, C.C., Liang, P.H., Yuan, H.S., and Chak, K.F. (2002). The zinc ion in the HNH motif of the endonuclease domain of colicin E7 is not required for DNA binding but is essential for DNA hydrolysis. Nucleic Acids Res. 30, 1670–1678. Larkin, C., Datta, S., Nezami, A., Dohm, J.A., and Schildbach, J.F. (2003). Crystallization and preliminary X-ray characterization of the relaxase domain of F factor TraI. Acta Crystallogr. D, in press. Laskowski, R.A., Macarthur, M.W., Moss, D.S., and Thornton, J.M. (1993). Procheck - a Program to Check the Stereochemical Quality of Protein Structures. J. Appl. Crystallogr. 26, 283–291. Llosa, M., Grandoso, G., Hernando, M.A., and de la Cruz, F. (1996). Functional domains in protein TrwC of plasmid R388: dissected DNA strand transferase and DNA helicase activities reconstitute protein function. J. Mol. Biol. 264, 56–67. Matson, S.W., and Morton, B.S. (1991). Escherichia coli DNA helicase I catalyzes a site- and strand-specific nicking reaction at the F plasmid oriT. J. Biol. Chem. 266, 16232–16237. Matson, S.W., Sampson, J.K., and Byrd, D.R. (2001). F plasmid conjugative DNA transfer: the traI helicase activity is essential for DNA strand transfer. J. Biol. Chem. 276, 2372–2379. Merritt, E.A., and Bacon, D.J. (1997). Raster3D: photorealistic molecular graphics. Methods Enzymol. 277, 505–524. Mitton-Fry, R.M., Anderson, E.M., Hughes, T.R., Lundblad, V., and Wuttke, D.S. (2002). Conserved structure for single-stranded telomeric DNA recognition. Science 296, 145–147. Odegrip, R., and Haggard-Ljungquist, E. (2001). The two active-site tyrosine residues of the a protein play non-equivalent roles during initiation of rolling circle replication of bacteriophage p2. J. Mol. Biol. 308, 147–163. Otwinowski, Z., and Minor, W. (1997). Processing of X-ray diffraction
Sequence-Specific Single-Stranded DNA Nuclease 1379
data collected in oscillation mode. Methods Enzymol. A 276, 307–326.
Accession Numbers
Pansegrau, W., and Lanka, E. (1996). Mechanisms of initiation and termination reactions in conjugative DNA processing. Independence of tight substrate binding and catalytic activity of relaxase (TraI) of IncPalpha plasmid RP4. J. Biol. Chem. 271, 13068–13076.
The coordinates have been deposited in the Protein Data Bank (accession code 1P4D).
Pansegrau, W., Schroder, W., and Lanka, E. (1994). Concerted action of three distinct domains in the DNA cleaving-joining reaction catalyzed by relaxase (TraI) of conjugative plasmid RP4. J. Biol. Chem. 269, 2782–2789. Peersen, O.B., Ruggles, J.A., and Schultz, S.C. (2002). Dimeric structure of the Oxytricha nova telomere end-binding protein alpha-subunit bound to ssDNA. Nat. Struct. Biol. 9, 182–187. Pommer, A.J., Kuhlmann, U.C., Cooper, A., Hemmings, A.M., Moore, G.R., James, R., and Kleanthous, C. (1999). Homing in on the role of transition metals in the HNH motif of colicin endonucleases. J. Biol. Chem. 274, 27153–27160. Raghunathan, S., Kozlov, A.G., Lohman, T.M., and Waksman, G. (2000). Structure of the DNA binding domain of E. coli SSB bound to ssDNA. Nat. Struct. Biol. 7, 648–652. Reygers, U., Wessel, R., Muller, H., and Hoffmann-Berling, H. (1991). Endonuclease activity of Escherichia coli DNA helicase I directed against the transfer origin of the F factor. EMBO J. 10, 2689–2694. Roussel, A., and Cambillau, C. (1991). The TURBO-FRODO graphics package. In Silicon Graphics Geometry Partners Directory (Mountain View, CA: Silicon Graphics), pp. 81. Sherman, J.A., and Matson, S.W. (1994). Escherichia coli DNA helicase I catalyzes a sequence-specific cleavage/ligation reaction at the F plasmid origin of transfer. J. Biol. Chem. 269, 26220–26226. Shub, D.A., Goodrich-Blair, H., and Eddy, S.R. (1994). Amino acid sequence motif of group I intron endonucleases is conserved in open reading frames of group II introns. Trends Biochem. Sci. 19, 402–404. Stern, J.C., and Schildbach, J.F. (2001). DNA recognition by F Factor TraI36: highly sequence-specific binding of single-stranded DNA. Biochemistry 40, 11586–11595. Street, L.M., Harley, M.J., Stern, J.C., Larkin, C., Williams, S.L., Miller, D.L., Dohm, J.A., Rodgers, M.E., and Schildbach, J.F. (2003). Subdomain organization and catalytic residues of the F factor TraI relaxase domain. Biochim. Biophys. Acta 1646, 86–99. Sui, M.J., Tsai, L.C., Hsia, K.C., Doudeva, L.G., Ku, W.Y., Han, G.W., and Yuan, H.S. (2002). Metal ions and phosphate binding in the H-N-H motif: crystal structures of the nuclease domain of ColE7/ Im7 in complex with a phosphate ion and different divalent metal ions. Protein Sci. 11, 2947–2957. Szebenyi, D.M.E., Arvai, A., Ealick, S., Laluppa, J.M., and Nielsen, C. (1997). A system for integrated collection and analysis of crystallographic diffraction data. J. Synchrotron Radiat. 4, 128–135. Traxler, B.A., and Minkley, E.G., Jr. (1988). Evidence that DNA helicase I and oriT site-specific nicking are both functions of the F TraI protein. J. Mol. Biol. 204, 205–209. Valdar, W.S., and Thornton, J.M. (2001). Conservation helps to identify biologically relevant crystal contacts. J. Mol. Biol. 313, 399–416. Weaver, L.H., Kwon, K., Beckett, D., and Matthews, B.W. (2001). Corepressor-induced organization and assembly of the biotin repressor: a model for allosteric activation of a transcriptional regulator. Proc. Natl. Acad. Sci. USA 98, 6045–6050. Weeks, C.M., and Miller, R. (1999). The design and implementation of SnB version 2.0. J. Appl. Crystallogr. 32, 120–124. Wilson, K.P., Shewchuk, L.M., Brennan, R.G., Otsuka, A.J., and Matthews, B.W. (1992). Escherichia coli biotin holoenzyme synthetase/ bio repressor crystal structure delineates the biotin- and DNA-binding domains. Proc. Natl. Acad. Sci. USA 89, 9257–9261. Zechner, E.L., de la Cruz, F., Eisenbrandt, R., Grahn, A.M., Koraimann, G., Lanka, E., Muth, G., Pansegrau, W., Thomas, C.M., Wilkins, B.M., et al. (2000). Conjugative-DNA transfer processes. In The Horizontal Gene Pool, C.M. Thomas, ed. (Amsterdam: Harwood Academic Publishers), pp. 87–174.