Structure, Vol. 11, 1359–1367, November 2003, ©2003 Elsevier Science Ltd. All rights reserved.
DOI 10.1016/j.str.2003.09.014
Technical Advance

Selenomethionine and Selenocysteine Double Labeling Strategy for Crystallographic Phasing

Marie-Paule Strub,1 François Hoh,1 Jean-Frédéric Sanchez,1 Jean Marc Strub,2 August Böck,3 André Aumelas,1 and Christian Dumas1,*

1 Centre de Biochimie Structurale, UMR CNRS 5048, UMR 554 INSERM, Université Montpellier I, 34090 Montpellier Cedex, France
2 Laboratoire de Spectrométrie de Masse Bio-organique, ECPM, 25 rue Becquerel, 67087 Strasbourg Cedex 2, France
3 Lehrstuhl für Mikrobiologie der Universität München, D-80638 Munich, Germany
*Correspondence: [email protected]

Summary

A protocol for the quantitative incorporation of both selenomethionine and selenocysteine into recombinant proteins overexpressed in Escherichia coli is described. This methodology is based on the use of a suitable cysteine auxotrophic strain and a minimal medium supplemented with selenium-labeled methionine and cysteine. The proteins chosen for these studies are the cathelin-like motif of protegrin-3 and a nucleoside-diphosphate kinase. Analysis of the purified proteins by electrospray mass spectrometry and X-ray crystallography revealed that both cysteine and methionine residues were isomorphously replaced by selenocysteine and selenomethionine. Moreover, selenocysteines allowed the formation of unstrained and stable diselenide bridges in place of the canonical disulfide bonds. In addition, we showed that NDP kinase contains a selenocysteine adduct on Cys122. This novel selenium double-labeling method is proposed as a general approach to increase the efficiency of the MAD technique used for phase determination in protein crystallography.

Introduction

The seminal work of Cowie and Cohen (1957) demonstrating the possibility of introducing selenium into proteins was later exploited as a powerful experimental tool for solving the phase problem in crystallography (Hendrickson et al., 1990; Hendrickson, 1991). The ever-increasing number of protein crystal structures determined by the multiwavelength anomalous dispersion (MAD) method was made possible primarily by the simplicity of the in vivo selenomethionine (SeMet) incorporation into recombinant proteins. Expression systems are mainly based on Escherichia coli but were recently extended to baculovirus (Chen and Bahl, 1991; Bellizzi et al., 1999), Saccharomyces cerevisiae (Bushnell et al., 2001), Pichia pastoris (Larsson et al., 2002; Xu et al., 2002), and mammalian cells (Lustbader et al., 1995; Wu et al., 1994). Various cell-free expression systems also offer a promising alternative for the convenient production of labeled proteins for structural studies (Kigawa et al., 2002). Today, SeMet incorporation and MAD phasing are routinely used and have become the norm in protein crystallography. Such labeling avoids the problematic and lengthy heavy-atom screening procedure. Although low incorporation levels and inadequate yields remained the major impediments to wider use in the early 1990s, efficient protocols and auxotrophic strains are now available to circumvent these difficulties (Doublié, 1997). Results from a large number of examples demonstrate that the biological substitution of SeMet for Met residues does not disturb the 3D structure of these labeled proteins.

The major problem of the MAD technique is the phasing power of the heavy-atom markers, limited here by the number of ordered, high-occupancy selenium sites relative to the number of protein atoms (Hendrickson, 1999; Boggon and Shapiro, 2000; Ealick, 2000). About one SeMet for every 75–100 amino acids is needed to successfully derive phase estimates (Hendrickson and Ogata, 1997; Hendrickson, 1999). This ratio depends on various factors: the optimal choice of wavelength, the accuracy and the redundancy of the measurements, the crystal quality, and the presence of noncrystallographic symmetry (Tesmer et al., 1996; Dauter and Adamiak, 2001). In some favorable cases, one ordered selenium-labeled methionine per 200 residues has proved to be sufficient for MAD phasing (Rudenko et al., 1999).
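The rule of thumb quoted above (roughly one ordered selenium site per 75–100 residues) can be checked for a given sequence with a few lines of Python. This is only an illustrative sketch: the function name, the toy sequence, and the option to ignore an N-terminal Met are assumptions made for the example, not part of the published protocol, and the count says nothing about whether a site is ordered in the crystal.

```python
def residues_per_se_site(sequence, label_cys=False, skip_n_terminal_met=True):
    """Count candidate Se sites and return (sites, residues per site)."""
    seq = sequence.upper()
    # An initiator Met is often disordered or removed, so it can be ignored.
    body = seq[1:] if (skip_n_terminal_met and seq.startswith("M")) else seq
    sites = body.count("M")
    if label_cys:
        # Double labeling: every Cys also becomes a potential Se site.
        sites += body.count("C")
    ratio = len(seq) / sites if sites else float("inf")
    return sites, ratio


# Hypothetical 200-residue sequence: 1 internal Met and 2 Cys.
toy_sequence = "MAKCLTMEGC" + "A" * 190

for label_cys in (False, True):
    sites, ratio = residues_per_se_site(toy_sequence, label_cys=label_cys)
    scheme = "SeMet + SeCys" if label_cys else "SeMet only"
    print(f"{scheme}: {sites} site(s), one per {ratio:.0f} residues")
```

For this toy sequence, SeMet labeling alone gives one site per 200 residues, whereas counting the cysteines as well brings the ratio below the 75–100 residue range.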
The single-wavelength anomalous dispersion (SAD) technique relies upon the data collected at the peak of the selenium absorption edge and is also sufficient to obtain electron density of good quality (Rice et al., 2000; Dauter et al., 2002). These phasing approaches are not hindered by the complexity of the selenium substructure, since direct methods allow the automatic location of the heavy atom sites (Deacon and Ealick, 1999). The incorporation of telluromethionines (Budisa et al., 1995, 1997) or modified amino acids such as seleno-tryptophans (Bae et al., 2001) into proteins for phase determination in crystallography has been proposed as an alternative to SeMet labeling. Furthermore, following the pioneering work on crambin (Hendrickson and Teeter, 1981), sulfur-SAD phasing of native protein crystals using the small but significant anomalous scattering signal from sulfur atoms has been recently revived and produced good electron density maps (Dauter et al., 1999; Dauter, 2002; Lemke et al., 2002; Micossi et al., 2002; Debreczeni et al., 2003). The potential benefit of this original approach using unlabeled proteins is, however, restricted by technical difficulties associated with long wavelength X-rays and the requirement of strongly diffracting crystals (better than ~2 Å resolution), highly redundant and accurate measurements, and a high content of Cys and Met residues (at least 4%–5%).

Natural selenocysteine (SeCys)-containing proteins are synthesized in highly species-specific systems by a complex translation machinery (Böck et al., 1991; Stadtman, 1996). The overproduction of thioredoxin containing SeCys has been described by Müller et al. in the E. coli BL21(DE3)cys auxotrophic strain with the genotype selB::kan cysE51 (Müller et al., 1994). They demonstrated that the high capacity of the bacterial protein synthesis machinery could be utilized for heterologous expression of prokaryotic or mammalian selenoproteins (Boschi-Muller et al., 1998; Arner et al., 1999). This strain was used to overexpress the selenocysteinyl derivative of the cathelin-like motif of protegrin-3 ([SeCys]-ProS). Its X-ray structure is the first example of a crystal structure solved by using the MAD technique from a SeCys derivative (Sanchez et al., 2002a).

In order to improve the phasing power of selenoproteins, we substitute both SeMet and SeCys residues for Met and Cys. The protocol is simple, achieves a high level of biosynthetic incorporation, and could be applied to any sulfur-containing protein, even those containing disulfide bridges. We have produced pure [SeMet, SeCys]-labeled proteins in sufficient quantities for crystallographic studies. The selenium incorporation has been checked by electrospray mass spectrometry, and the X-ray structures solved by SAD confirmed the expected isomorphism. We anticipate that the double labeling will significantly reinforce the phasing power, thereby extending the applications of the MAD and SAD techniques to solve new structures. Such a double labeling provides an attractive strategy particularly suited for the structural genomics projects that require a convenient high-throughput phasing method.
Results

The feasibility of labeling a recombinant protein with SeCys and its exploitation as a phasing tool to solve its 3D structure by MAD have been recently demonstrated (Sanchez et al., 2002a). To further extend our approach, we selected the same protein, namely the cathelin-like domain of protegrin-3 (ProS), and a nucleoside-diphosphate kinase (NDP kinase) as test cases to establish the feasibility of targeting both Met and Cys residues for an anomalous-scatterer labeling strategy. ProS belongs to a family of precursors of antibiotic peptides involved in the host innate defense mechanisms of several mammalian species, whereas NDP kinase is a ubiquitous enzyme (EC 2.7.4.6) that catalyzes the exchange of phosphate between nucleoside tri- and diphosphates.

Overexpression of the Selenium-Labeled Proteins

As described in Experimental Procedures, [SeMet, SeCys]-labeled proteins were expressed with appropriate modifications of the protocol optimized for [SeCys]-thioredoxin (Müller et al., 1994) and successfully used for labeling various SeCys proteins (Boschi-Muller et al., 1998; Arner et al., 1999; Sanchez et al.,
2002a). The auxotrophic E. coli host cell BL21(DE3) selB::kan cys51E (Müller et al., 1994), referred to as BL21(DE3)cys, was selected for our double labeling experiments. In this strain, selB is inactivated by the insertion of a kanamycin resistance cassette, hindering the incorporation of selenium during protein synthesis. CysE (the serine O-acetyltransferase) is defective, preventing the formation of O-acetylserine and thereby inhibiting the synthesis of cysteine. An additional requirement is that the metabolic pathway that supplies the cells with methionine be switched off. Thus, rather than using a double auxotrophic cys−/met− strain, our dual labeling strategy relies on the adaptation of the methionine biosynthesis pathway inhibition method (Van Duyne et al., 1993) to the growth of this auxotrophic cys− strain. This method principally exploits the allosteric response of the aspartokinase-homoserine dehydrogenase enzymes that catalyze the first and third steps in the pathway leading to the synthesis of methionine. Accordingly, the host cells are grown to a suitable cell mass in this "double labeling minimal medium" (DLMM) containing a carbon source (glucose or glycerol), vitamins, and growth-limiting amounts of canonical amino acids but high concentrations of amino acids known to inhibit the methionine biosynthesis pathway (Van Duyne et al., 1993). The efficiency of the labeling procedure relies on a very low basal expression of cloned target genes during this growing phase. The T7 polymerase-based pET system provides such an efficient and stringently controlled expression of the target protein. Finally, selenomethionine and selenocysteine were added together with rifampicin, which inhibits E. coli RNA polymerase and therefore cellular protein synthesis. A high level of target protein synthesis and labeling is thus obtained upon induction by IPTG in this nongrowing phase.
L-selenocysteine is provided in the medium as DL-selenocystine, since selenocysteine is cytotoxic (Müller et al., 1994). In the present work, ProS was overexpressed as an N-terminally His-tagged protein (HisTag-ProS). Whereas the native protein does not contain methionine, its subcloning into a pET15b vector (Sanchez et al., 2002b) generated a methionine in the disordered N-terminal segment. As described (Sanchez et al., 2002a), we took advantage of the presence of the four cysteine residues in ProS to overexpress it in the cysteine auxotrophic strain of E. coli BL21(DE3)cys, which is able to incorporate L-selenocysteine instead of cysteine. We show here that ProS and the NDP kinase H122C from Dictyostelium discoideum (Dumas et al., 1992) can be labeled with both SeMet and SeCys. The purification protocols applied to these selenoproteins were similar to those of the corresponding native proteins expressed in regular BL21(DE3) strains, with comparable yields, up to 10–13 mg of pure protein per liter of culture in these two cases.
Table 1. ESI-MS Data for the SeMet-SeCys-Labeled Proteins

Protein                                   Theoretical MW(a) (Da)   Measured MW (Da)   Number of Se Incorporated
[SeMet, SeCys]-NDP kinase H122C mutant    16,770.1                 16,936.8 ± 0.7     3 Se + 1 [SeCys](b)
                                                                   16,795.6 ± 0.6     1 [SeCys]
HisTag-[SeMet, SeCys]-ProS                13,704.1                 13,704.0 ± 0.2     5 Se
                                                                   13,657.2 ± 0.2     4 Se
                                                                   13,610.2 ± 0.3     3 Se

ESI-MS data for the [SeMet, SeCys] derivatives of HisTag-ProS (1 methionine and 4 cysteines) and NDP kinase H122C mutant (2 methionines and 1 cysteine), overexpressed in the cysteine auxotroph E. coli strain BL21(DE3) selB::kan cys51E. Notice that the replacement of a sulfur atom by a selenium atom increases the molecular weight by 46.9 Da and the addition of a selenocysteine by 167.0 Da.
(a) Calculated with a quantitative incorporation of Se.
(b) [SeCys] is for the selenocysteine adduct.

ESI-MS Analysis of the Double Labeled Proteins

The incorporation of selenium atoms in overexpressed ProS was monitored by ESI-MS (Table 1). The spectrum of the HisTag-ProS derivatives clearly indicates that the main compound (about 50%) contains five atoms of
selenium, corresponding to the four cysteines and the single methionine. Two minor peaks, corresponding to partially selenium-substituted protein species with four (35%) and three (15%) selenium atoms, were also characterized. ESI-MS data of NDP kinase H122C show a major compound (about 70%) of 16,936.8 Da (Table 1), corresponding to the derivative containing two SeMet and a potential diselenide adduct between the protein and free selenocysteine (expected mass increase of 167.0 Da). A second compound of 16,795.6 Da corresponds to the unlabeled NDP kinase with a selenocysteine adduct on the cysteine 122 mediated by an S-Se bridge. These data clearly reveal that the free and reactive Cys122 has undergone a covalent modification by a D- or L-selenocysteine amino acid during biosynthesis in E. coli.

Crystallographic Data

Further evidence for the effective selenium labeling of both methionine and cysteine residues comes from X-ray data. Our experiments showed that [SeMet, SeCys]-ProS and NDP kinase derivatives were able to crystallize isomorphously with the unlabeled proteins, in the same hexagonal space groups. The unit cell parameters a = b = 51.54 Å, c = 134.51 Å (ProS derivative) and a = b = 76.63 Å, c = 106.51 Å (NDP kinase derivative) compare well to those measured for wild-type ProS (a = b = 51.36 Å, c = 133.68 Å) and NDP kinase (a = b = 74.6 Å, c = 105.5 Å), respectively. The heterogeneity due to the presence of various [SeMet, SeCys] derivatives of ProS and NDP kinase did not hamper the growth of well-diffracting crystals. With four selenium atoms for 105 amino acids in the asymmetric unit, we previously demonstrated that the crystal of [SeCys]-ProS was well suited to obtain the phase parameters by the MAD technique, which allowed us to solve the structure de novo (Sanchez et al., 2002a). We show here that the [SeMet, SeCys]-labeled protein crystals of ProS and NDP kinase provide a suitable anomalous dispersive signal for phase calculations by SAD (Table 2).
These phases, improved by solvent flattening, had an average difference of 44.8° (ProS) and 52.4° (NDP kinase) from the final phase sets and gave readily interpretable electron-density maps. Comparison of the final and SAD electron-density maps yielded average correlation coefficients of 0.84 and 0.71 for ProS and NDP kinase, respectively. The good quality of these experimental electron-density maps is illustrated in Figures 1A and 1B.
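An average phase difference such as the one quoted above has to account for the periodicity of angles: phases of 350° and 10° differ by 20°, not 340°. The sketch below shows one simple, unweighted way to compute it. It is an assumption-laden illustration rather than the procedure actually used here; phasing programs commonly weight each reflection (for instance by figure of merit or amplitude), and the function name and input values are hypothetical.

```python
def mean_phase_difference(phases_a, phases_b):
    """Unweighted mean absolute phase difference in degrees, each term in [0, 180]."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase lists must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(phases_a, phases_b):
        d = abs(a - b) % 360.0
        # Take the shorter way around the circle.
        total += min(d, 360.0 - d)
    return total / len(phases_a)


# Hypothetical phases (degrees) for three reflections.
experimental = [350.0, 10.0, 90.0]
refined = [10.0, 350.0, 90.0]
print(f"{mean_phase_difference(experimental, refined):.1f} degrees")
```

A value of 90° would be expected for random phases, which is why differences of about 45°–52° still yield interpretable maps.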
An anomalous difference Fourier map at 2.7 Å resolution was calculated for a selenium-labeled NDP kinase crystal to identify the anomalous scatterers. As displayed in Figure 1C, three peaks are observed near Met residues 80 and 94, as well as Cys122. The former two are associated with the methionyl Se atoms (peak heights of 12.5σ and 11.8σ), while the latter is elongated in shape (peak height of 18.9σ) and covers both the Se atom of Cys122 and a Se atom, 2.4 Å away, belonging to the additional selenocysteine. This feature is characteristic of the formation of a diselenide bridge between the Se of Cys122 and a selenocysteine amino acid present in the medium. Only the selenium atom is clearly visible, the other atoms of the cysteine adduct being disordered and absent from the electron-density map (Figure 1A). This is fully compatible with the ESI-MS data (Table 1) that clearly indicate the presence of such an adduct formed with labeled and unlabeled proteins. The estimated high occupancy of this selenium atom in the adduct confirms that these stable diselenide or sulfo-selenide forms are present in more than 90% of the NDP kinase samples. A low but significant anomalous peak (peak height of 4.2σ) is likely to represent an alternate rotamer of SeCys122.

As shown in Figure 1D, the anomalous difference Fourier map for ProS reveals two elongated peaks (peak height of about 17σ) coinciding with the 85–96 and 107–124 cystines. The corresponding four peaks are clearly individualized above 12σ. The occupancy of all selenium atoms is estimated to be greater than 0.9 and confirms the high level of SeCys incorporation. Although the incorporation of a SeMet was clearly demonstrated by ESI-MS, the N-terminal SeMet is disordered and therefore does not contribute to anomalous scattering.
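The ESI-MS assignments referred to above follow from simple mass arithmetic using the two increments given in the note to Table 1 (46.9 Da per sulfur-to-selenium replacement, 167.0 Da per selenocysteine adduct). The following Python sketch reproduces that bookkeeping; the function and constant names are illustrative assumptions, while the numerical inputs are taken from Table 1.

```python
SE_MINUS_S = 46.9     # Da gained per S -> Se replacement (note to Table 1)
SECYS_ADDUCT = 167.0  # Da gained per selenocysteine adduct (note to Table 1)


def expected_mass(fully_labeled_mass, total_sites, n_se, n_adducts=0):
    """Expected mass of a species with n_se selenium atoms and n_adducts adducts.

    fully_labeled_mass is the theoretical mass with all total_sites
    Met/Cys sulfur atoms replaced by selenium (Table 1, column 2).
    """
    unlabeled = fully_labeled_mass - total_sites * SE_MINUS_S
    return unlabeled + n_se * SE_MINUS_S + n_adducts * SECYS_ADDUCT


# HisTag-ProS: 1 Met + 4 Cys = 5 sites, theoretical mass 13,704.1 Da.
for n_se in (5, 4, 3):
    print(f"ProS, {n_se} Se: {expected_mass(13704.1, 5, n_se):.1f} Da")

# NDP kinase H122C: 2 Met + 1 Cys = 3 sites, theoretical mass 16,770.1 Da.
print(f"NDP kinase, 3 Se + adduct: {expected_mass(16770.1, 3, 3, 1):.1f} Da")
print(f"NDP kinase, 0 Se + adduct: {expected_mass(16770.1, 3, 0, 1):.1f} Da")
```

With these inputs the five-, four-, and three-selenium ProS species come out at 13,704.1, 13,657.2, and 13,610.3 Da, and the two NDP kinase species at 16,937.1 and 16,796.4 Da, each within 1 Da of the measured masses in Table 1.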
Table 2. Summary of Crystallographic Data Statistics

                                          NDP Kinase                  ProS
Data sets
  Space group                             P6₃22                       P6₅22
  Unit cell parameters (Å)                a = 76.63, c = 106.51       a = 51.54, c = 134.51
  Wavelength (Å)                          0.9754                      0.9799
  Number of measurements                  114,811                     48,282
  Number of unique reflections            5,418                       5,162
  Resolution range (Å)                    25.0–2.7 (2.74–2.70)(a)     23.0–2.3 (2.36–2.30)
  Completeness (%)                        98.4 (97.4)                 99.7 (98.6)
  Average intensity I/σ(I)                18.0 (3.1)                  13.1 (2.9)
  Rmerge(b)                               0.074 (0.276)               0.078 (0.295)
Phasing
  Rcullis/Rkraut (acentric reflections)   0.82/0.033                  0.68/0.044
  Anomalous phasing power                 1.42                        2.20
  Figure of merit                         0.30                        0.37
Refinement
  Number of protein atoms                 1127                        686
  Number of waters                        15                          35
  Number of reflections (working/free)    4823/595                    4503/659
  Rwork/Rfree (%)(c)                      18.2/23.2                   21.9/28.3
  Overall B factor (Å²)                   37.2                        29.5
  Rmsd from ideal geometry
    Bond lengths (Å)                      0.015                       0.016
    Bond angles (°)                       1.544                       1.560
  Ramachandran plot (%)
    Most favored region                   93.4                        91.5
    Additional allowed                    5.8                         8.5
    Disallowed region                     0.8                         0

(a) Numbers in parentheses refer to reflections in the outer resolution shell.
(b) $R_{\mathrm{merge}} = \sum_{h}\sum_{i} |I_i(h) - \langle I(h)\rangle| \,/\, \sum_{h}\sum_{i} I_i(h)$, where $\langle I(h)\rangle$ is the average intensity of equivalent reflections $I_i(h)$ and the sum is extended over all measured observations for all unique reflections.
(c) $R_{\mathrm{work}} = \sum |F^{o}_{h,k,l} - F^{c}_{h,k,l}| \,/\, \sum |F^{o}_{h,k,l}|$, where $F^{o}_{h,k,l}$ and $F^{c}_{h,k,l}$ are the observed and calculated structure factor amplitudes.

Sulfur-Selenium Isomorphism

Furthermore, the determination of the X-ray structures of these two labeled proteins allowed a detailed comparison with their native counterparts and analysis of the characteristics of the models in the regions of SeMet and SeCys residues. The strong similarity of the X-ray structure of the native ProS (Sanchez et al., 2002a) with that of its [SeMet, SeCys] derivative (overall rmsd of 0.41 Å for the 84 Cα atoms) confirms their isomorphism at the atomic level (Figures 2A and 2B). There is little change in the global structure of [SeMet, SeCys]-labeled NDP kinase when compared with the native form (overall rmsd of 0.40 Å for the 148 Cα atoms). The most pronounced changes around SeCys122 in NDP kinase are the movement of the backbone atoms (about 0.5 Å for Cα122) and a small displacement (0.6 Å) of the side chain of His55 (Figure 2C). These localized and limited alterations in the active site of the enzyme probably result from steric hindrance arising from the formation of the SeCys adduct on C122.

The engineered four SeCys substitutions in ProS lead to the formation of the expected intramolecular diselenide bridges Se85-Se96 and Se107-Se124. In wild-type ProS, the Cα85-Cα96 distance was 5.9 Å. This value increases moderately to 6.2 Å for labeled ProS (Figure 2B). A similar increase (from 3.8 to 4.0 Å) is also observed between Cα107 and Cα124 for the cystine linkage that connects the two antiparallel β strands (Figure 2A). In agreement with these observations, analysis of the stereochemical parameters of the diselenides indicates an absence of significant steric strain on the cystine bridge. The values for the dihedral C-S-S-C and C-Se-Se-C angles of the unlabeled and labeled ProS are very similar: 97.9° and 102.9° for C107-C124; −93.0° and −91.9° for C85-C96, respectively. As illustrated in Figures 2A and 2B, the geometrical distortion effects (Se-Se distance of 2.4 Å instead of 2.0 Å for S-S) are tolerated without noticeable alterations of the environment. The replacement of a sulfur atom (van der Waals radius 1.8 Å) by a selenium atom (van der Waals radius 1.95 Å) does not induce significant distortion either in the diselenide bridge or in the polypeptide fold. These results suggest the highly isomorphous character of the sulfur-selenium replacement in cysteine residues, as previously demonstrated in the case of methionine.
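The C-S-S-C and C-Se-Se-C dihedral angles compared above are torsion angles defined by four consecutive atoms. As a minimal sketch of how such an angle can be obtained from atomic coordinates, the Python function below uses the standard atan2 formulation; the function names and the coordinates are made up for illustration and are not taken from the deposited structures.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n2 = _cross(b2, b3)
    # y carries the sign (handedness), x the cosine-like component.
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates (in Å) for a C-Se-Se-C unit, chosen for clarity.
c_beta_a = (1.0, 0.0, 0.0)
se_a = (0.0, 0.0, 0.0)
se_b = (0.0, 0.0, 1.0)
c_beta_b = (0.0, 1.0, 1.0)
print(f"{dihedral(c_beta_a, se_a, se_b, c_beta_b):.1f} degrees")
```

Applied to the refined coordinates of the four atoms of each bridge, a function of this kind would give values directly comparable with the angles quoted in the text.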
Figure 1. The Experimental Electron Density Maps from SAD Phasing after Solvent Flattening Are Displayed in Blue
(A) The electron density is superimposed on the refined model of the [SeMet, SeCys]-labeled NDP kinase and contoured at 1.1σ. For clarity, the segment including SeMet80 was omitted.
(B) The electron density around the C85-C96 diselenide bridge is superimposed on the refined model of the [SeMet, SeCys]-labeled ProS. The selenium atoms are colored in green. The protein atoms are shown in red, blue, and yellow for oxygen, nitrogen, and carbon, respectively.
(C) Anomalous difference Fourier map of the [SeMet, SeCys]-labeled NDP kinase, superimposed with the Cα trace of the refined model (blue). The map is contoured at 4σ (red) and 9σ (magenta) surrounding a protomer of the NDP kinase. SeMet80, SeMet94, and SeCys122 are displayed as ball-and-stick side chain models (yellow-green color).
(D) Side view of the Cα trace of the refined [SeMet, SeCys]-ProS with superimposed anomalous difference Fourier map contoured at 4σ (red) and 10σ (cyan). The SeCys residues 85–96 and 107–124 involved in two diselenide bridges are displayed in yellow and green.

Statistical Survey of Cys-Met Occurrences in Sequence Databases

The frequency of sulfur-containing residues in proteins is relatively low at 2.4% and 1.9% for Met and Cys, respectively. This needs to be taken into account when considering whether this double labeling strategy may be of use. It is for this reason that we have re-analyzed here the relative abundance of Met and Cys residues in protein sequences. The statistical data were obtained using 86,525 protein sequences included in the Swiss-Prot database (release 39.0) (Bairoch and Apweiler, 2000) and 15,880 structure entries in the Protein Data Bank (Berman et al., 2000). The N-terminal Met residue in the Swiss-Prot sequences was systematically omitted and, in the survey of the Protein Data Bank sequences, only the Met and Cys residues present in well-ordered parts of the models were counted. Such an approach allows us to take into account the real fraction of these residues potentially contributing to the phasing power (Boggon and Shapiro, 2000). The graphs for cumulated percentage content of each class of residues in these databases are presented in Figure 3. A reasonable threshold for a successful MAD experiment requires about 1%–1.3% of sulfur-containing amino acids (Hendrickson, 1999; Boggon and Shapiro, 2000). Thus, the number of selenoproteins for which structures could be solved by this technique includes 60%–80% or 40%–55% of the proteins, based solely on their methionine or cysteine content, respectively. The Cys and Met double labeling increases the number of proteins whose structure could be potentially determined by MAD methodology to 88%–93% (Figure 3). The phasing power could be
further increased if we take into account the formation of a selenocysteine adduct on each accessible cysteine residue. Thus, keeping in mind this simple means of selenium labeling both on cysteine and methionine, the vast majority of proteins would be amenable to crystal structure determination by MAD.

Discussion

In this study, we showed that the BL21(DE3)cys strain initially employed for the selenium labeling of cysteines can also efficiently be used for high-level biosynthetic incorporation of both SeMet and SeCys in proteins. ESI-MS was used to monitor the incorporation of SeMet and SeCys in NDP kinase and ProS. These labeled proteins have been crystallized and diffraction data were collected. A set of accurate phases has been obtained using the SAD method and both
structures show full isomorphism with native proteins. These labeling experiments provide direct evidence of isostructuralism of Cys and SeCys proteins. This property also extends to the formation of diselenide bridges, in the ProS protein as well as in synthetic peptides (Pegoraro et al., 1998, 1999). These findings strongly support the view that the complete replacement of sulfur by selenium in proteins produces folded and fully isomorphous molecules able to crystallize as native or SeMet-labeled proteins.

Figure 2. Sulfur-Selenium Isomorphism in ProS and NDP Kinase Structures
(A and B) Comparison of 3D structures of native ProS (gray) (PDB entry 1KWI) and [SeMet, SeCys]-ProS derivative in the surroundings of the C107-C124 (A) and C85-C96 (B) disulfide bridges. The diselenide bonds are colored in yellow and N, C, O atoms are colored in blue, green, and red, respectively.
(C) Comparison of the X-ray structure of [SeMet, SeCys]-NDP kinase H122C mutant with the unlabeled one (PDB entry 1NDK) in the vicinity of the SeCys122 residue. The unlabeled protein is colored in gray.

Figure 3. Statistical Survey of Cysteine and Methionine in Protein Sequence Databases
Graphs showing the cumulated distribution of the number of protein sequences as a function of the percentage of cysteine (short dashed line), methionine (long dashed line), and sulfur-containing residues (solid line) from all protein structures (A) in the PDB, solved by X-ray crystallography, and (B) from all protein sequences in the Swiss-Prot database.

In our experiments, we note the quantitative formation of a selenocysteine adduct. This exploits the highly reactive selenolate anion and could be extended to any protein containing accessible free cysteines. The insertion of this additional selenium atom enhanced the phasing power of the labeled protein crystal. If necessary, it could be reinforced during the final purification steps and the crystallization experiments by addition of
selenocysteine. As potent nucleophiles, both the thiolate and selenolate anions could also react with mercuric complexes or organomercurials, the typical heavy atom compounds used in protein crystallography. The oxidized forms of Se have sharper K-edge features and f′ and f″ values of greater amplitude than does the reduced form (Smith and Thompson, 1998; Thomazeau et al., 2001). Thus, an optimal data collection for MAD and SAD requires a precise determination of the absorption spectrum on a dedicated synchrotron beam line to increase the phasing power of these anomalous scatterers. Moreover, a statistical analysis of sulfur-containing proteins was carried out to highlight the potential of this [SeMet, SeCys] double labeling scheme in crystallography, and we propose it as a general approach for phasing almost all protein crystals.

Functional Implications

Beyond a purely structural role, the wide involvement of cysteine residues in functional processes can now be targeted efficiently. This concerns principally the consequences of selenocysteine substitution in the catalytic activity of enzymes, electron transfer reactions involving various iron-sulfur clusters (nitrogenases), and DNA binding (the classic Cys-His zinc finger motifs). Because of the reducing environment in bacteria, heterologous proteins rich in cystine frequently do not fold properly. Secretion into the oxidizing environment of the periplasmic space, which contains enzymes catalyzing the formation and rearrangement of disulfide bonds, offers an efficient alternative, as does in vitro oxidative refolding chromatography with immobilized chaperones (Altamirano et al., 1999). For these proteins, the selenocysteine replacement could facilitate their folding. Indeed, the oxidation of the selenium form of cysteine is considerably faster than that of its sulfur derivative (Besse et al., 1997). Even if, in some cases, this high reactivity could generate nonnative diselenide bridges, as reported for the synthesis of apamin and endothelin analog peptides, the selenocysteine approach has been proven to be an efficient chemical tool for the induction of correct oxidative folding of multiple cysteine-containing peptides (Pegoraro et al., 1998). Therefore, it might be expected to assist the proper folding of recombinant proteins in E. coli cytoplasm by efficiently trapping early seleno analogs of disulfide-bonded intermediates.

In conclusion, we report here the feasibility of the selenium double labeling technique: two selenomethionine-selenocysteine derivatized proteins were expressed in E. coli, purified, and crystallized. This method opens the way for solving new structures without the requirement of tedious approaches using heavy atom derivatives, frequently needed to complement insufficient SeMet labeling. The simple and efficient selenium labeling approach applied to all sulfur atoms appears to be a very helpful phasing vehicle for protein X-ray crystallography using MAD and SAD experiments. Its valuable potential to speed up 3D structure determination by improving the quality of the electron-density maps and facilitating their interpretation promises to significantly contribute to high-throughput structural genomics.

Experimental Procedures

Cloning of ProS and NDP Kinase H122C

The gene coding for the protegrin-3 cathelin-like motif (ProS) from Sus scrofa was inserted into the NdeI-BamHI sites of a pET15b vector (Novagen) as previously described (Sanchez et al., 2002b). A thrombin cleavage site was used to remove the HisTag sequence.
ProS comprises residues 30 to 130 and contains, at its N terminus, an additional 4 residue insert (GSHM) including the sole methionine in the sequence. Cloning of the gene coding for the NDP kinase H122C from Dictyostelium discoideum has been reported elsewhere (Dumas et al., 1992).

Expression of the [SeMet, SeCys]-Labeled ProS

The auxotrophic E. coli host cell BL21(DE3) selB::kan cys51E (Müller et al., 1994) was transformed with the pET15b/ProS plasmid. Double labeling was carried out on a 1 l scale in Erlenmeyer flasks. In our experiments, the synthetic medium called "DLMM" used for cell growth and expression contained 10 g l⁻¹ potassium monohydrogen phosphate, 2 g l⁻¹ ammonium chloride, 1 g l⁻¹ sodium acetate, 2.75 g l⁻¹ sodium succinate, 0.435 g l⁻¹ magnesium acetate, 0.01 g l⁻¹ calcium chloride, 1.6 g l⁻¹ serine, 1.0 g l⁻¹ leucine, 0.4 g l⁻¹ alanine, glutamic acid, glutamine, arginine, and glycine, 0.25 g l⁻¹ aspartic acid, 0.1 g l⁻¹ asparagine, histidine, lysine, proline, threonine, tyrosine, and isoleucine, 0.05 g l⁻¹ tryptophan, valine, and phenylalanine, 10 g l⁻¹ glycerol, 40 nM boric acid, 3 nM cobalt chloride, 0.1 nM copper chloride, 8 nM manganese chloride, 1 nM zinc chloride, 100 mg l⁻¹ nicotinamide, 50 mg l⁻¹ thiamine, 0.5 mg l⁻¹ biotin, 100 mg l⁻¹ carbenicillin, and 15 mg l⁻¹ kanamycin. 200 ml of overnight culture of BL21(DE3)cys/ProS in LB containing 100 mg l⁻¹ carbenicillin, 15 mg l⁻¹ kanamycin, and 50 mg l⁻¹ cysteine were centrifuged at 5000 × g for 15 min at 4°C. The pellet was collected, washed twice with a solution of 10 g l⁻¹ potassium monohydrogen phosphate and 1 g l⁻¹ sodium acetate and resuspended in 1 l of DLMM medium supplemented with 50 mg l⁻¹ cysteine and
50 mg l⁻¹ methionine. The culture was grown at 37°C for 6 hr, the expression was induced by addition of 1 mM IPTG and, after 10 min, chloramphenicol was added to a final concentration of 10 mg l⁻¹. After 5 min incubation, the cells were harvested under sterile conditions and centrifuged at 5000 × g for 15 min. The pellets were washed twice with a solution of 10 g l⁻¹ potassium monohydrogen phosphate and 1 g l⁻¹ sodium acetate and resuspended in 1 l of DLMM medium supplemented with 600 µM DL-selenocystine (Sigma-Aldrich), 425 µM DL-selenomethionine (Sigma-Aldrich), 1 mM IPTG, and 400 mg l⁻¹ rifampicin. About 5–6 g of cells were collected after overnight incubation at 37°C and stored at −80°C.

Expression of the [SeMet, SeCys]-Labeled NDP Kinase

The auxotrophic E. coli host cell BL21(DE3) selB::kan cys51E was transformed with the pTZ18/NDP kinase H122C plasmid. Both the cell growth and overexpression steps were performed as described above for [SeMet, SeCys]-ProS with a few modifications to optimize the expression. The bacteria were allowed to grow for 3–4 hr before induction by IPTG. The carbon source used in the culture media was glucose instead of glycerol and no rifampicin was added during expression. About 5 g of a deep-orange cell pellet was collected and stored at −80°C.

Purification and Crystallization of Labeled Proteins

The [SeMet, SeCys]-ProS was extracted, purified, and crystallized as described previously for the native and [SeCys] proteins (Sanchez et al., 2002b). Briefly, the HisTag-labeled protein was purified by chromatography on a Ni-NTA Superflow affinity column (Qiagen) and analyzed by ESI-MS. The N-terminal HisTag was removed by thrombin cleavage and ProS further purified by gel filtration chromatography using a Superdex-75 column (Amersham Biosciences).
Crystals were grown by vapor phase diffusion in hanging drops containing equal volumes of Se-labeled protein stock (10–15 mg/ml) and precipitant mother liquor consisting of 2.1 M ammonium sulfate and sodium acetate buffer (pH 3.8). Crystals of 0.25 × 0.1 × 0.1 mm³ grew within a week. The [SeMet, SeCys]-NDP kinase H122C was purified by two consecutive column chromatography steps (Dumas et al., 1992): first on an ion exchange column (Q-Sepharose FF, Amersham Biosciences) to eliminate most of the DNA and the contaminant proteins, and then on a Blue-Sepharose 6FF affinity column (Amersham Biosciences). After elution with a linear NaCl gradient (0–700 mM), the protein was dialyzed and concentrated to 10 mg/ml in 20 mM Tris-HCl buffer (pH 7.5). Crystals were grown by vapor phase diffusion in hanging drops, in a manner similar to that reported previously (Morera et al., 1994). Protein drops were equilibrated against reservoirs containing 10% w/v PEG6000 (Fluka), 50 mM Tris-HCl buffer (pH 7.5), and 10 mM MgCl2. Crystals appeared within 1 week at 18°C.
Mass Spectrometry
The overexpressed selenoproteins were analyzed by ESI-MS on a VG Bio-Q quadrupole with a mass range of 4000 Da (Bio-Tech, Manchester, UK) in positive mode. ProS and NDP kinase proteins were desalted on Zip-Tip (Millipore). About 10 pmol were used for mass analysis. Scanning was performed from m/z = 500 Da to m/z = 1700 Da in 10 s. Calibration was performed using the multiply charged ions produced by a separate introduction of horse heart myoglobin (16,951.4 Da).
Structure Determination and Refinement
X-ray data for [SeMet, SeCys]-ProS to 2.3 Å resolution were collected at a wavelength of 0.9799 Å (peak of the fluorescence spectrum) on beamline BM30A (ESRF, France) (Roth et al., 2002) using a crystal frozen at 100 K in mother liquor supplemented with 27%–30% glycerol.
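For orientation, the peak wavelength quoted above can be converted to a photon energy with E = hc/λ. The sketch below is illustrative only: the constant hc ≈ 12.3984 keV·Å and the tabulated K edge of elemental selenium (about 12.658 keV) are assumed reference values that do not come from the paper, and the function name is hypothetical. The position of the fluorescence peak in a real crystal depends on the chemical environment of the selenium atoms and on beamline calibration, so only approximate agreement with the tabulated edge should be expected.

```python
# Product of Planck's constant and the speed of light, in keV * angstrom
# (assumed reference value).
HC_KEV_ANGSTROM = 12.398419843

# Tabulated K absorption edge of elemental selenium in keV (assumed reference value).
SE_K_EDGE_KEV = 12.658


def wavelength_to_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in angstroms."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


peak_energy_kev = wavelength_to_kev(0.9799)  # wavelength quoted for the ProS data
offset_ev = (peak_energy_kev - SE_K_EDGE_KEV) * 1000.0

print(f"Photon energy at 0.9799 A: {peak_energy_kev:.4f} keV")
print(f"Offset from tabulated Se K edge: {offset_ev:+.1f} eV")
```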
The diffraction data from a single crystal of [SeMet, SeCys]-NDP kinase were collected at 281 K on the W32 beamline at the LURE-DCI synchrotron (Orsay, France) with a nonoptimized choice of wavelength. Data processing and reduction were performed with the DENZO and SCALEPACK programs (Otwinowski and Minor, 1997). A summary of the SAD data collection statistics is shown in Table 1. For both crystals, the Harker sections of the anomalous Patterson maps clearly showed peaks corresponding to self-vectors for selenium
atoms. The program SHARP (de La Fortelle and Bricogne, 1997) was used to locate and refine selenium sites before conducting SAD phasing. After solvent flattening (Cowtan and Zhang, 1999), the SAD electron-density maps allowed us to position easily the corresponding atomic models of the unlabeled proteins (PDB entries 1NDK [Dumas et al., 1992] and 1KWI [Sanchez et al., 2002a] for NDP kinase and the cathelin-like motif of ProS, respectively). After a few cycles of rigid body and restrained refinement with the program REFMAC (Murshudov et al., 1999), these initial models were used to redetermine the Se positions in the diselenide bridges, which had initially been treated as single "superselenium" atoms. A new experimental SAD phasing was then carried out on each data set (Table 2). Anomalous difference Fourier maps were calculated with SAD phases improved by solvent flattening as provided in the CCP4 suite (CCP4, 1994). Further refinement of these structures was carried out with the program REFMAC. Bulk solvent and overall anisotropic temperature factor corrections were used throughout refinement. The final [SeMet, SeCys] derivative structures were refined to R and Rfree values of 18.2% and 23.2% for NDP kinase and 21.9% and 28.3% for ProS. The NDP kinase model includes residues 8–155 and 15 water molecules, and the ProS model includes residues 31–62, 69–114, and 120–128 with 35 water molecules. An analysis of the stereochemistry with PROCHECK (Laskowski et al., 1993) showed that all parameters were good. All nonglycine residues were located in the favored and allowed regions of the Ramachandran diagram except Ile120, which adopts an unusual main chain conformation in all NDP kinase structures. Final refinement statistics are summarized in Table 2.
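The ESI-MS check of selenium incorporation described under Mass Spectrometry rests on simple arithmetic, sketched below under stated assumptions. Each sulfur replaced by selenium adds the difference between the average atomic masses of Se and S (about 46.9 Da, from standard atomic weights not given in the paper), and a protein of mass M carrying z protons is observed at m/z = (M + z × 1.00728)/z. The function names are hypothetical, and the 12,000 Da protein with one Met and four Cys is an invented illustration, not a value from this work. Only the myoglobin mass and the 500–1700 scan window are taken from the text.

```python
PROTON_MASS = 1.007276        # Da, mass of a proton (assumed reference value)
SE_MINUS_S = 78.971 - 32.06   # Da gained per S-to-Se replacement (average masses)


def labeled_mass(native_mass, n_met, n_cys):
    """Expected average mass after replacing every Met and Cys sulfur by selenium."""
    return native_mass + (n_met + n_cys) * SE_MINUS_S


def mz(mass, charge):
    """m/z of the ion carrying `charge` protons in positive mode."""
    return (mass + charge * PROTON_MASS) / charge


def charge_states_in_window(mass, low=500.0, high=1700.0, max_charge=100):
    """Charge states whose m/z falls inside the scanned window."""
    return [z for z in range(1, max_charge + 1) if low <= mz(mass, z) <= high]


# Calibrant from the text: horse heart myoglobin, 16,951.4 Da.
myoglobin_states = charge_states_in_window(16951.4)
print("Myoglobin charge states in window:", myoglobin_states[0], "to", myoglobin_states[-1])

# Hypothetical protein: 12,000 Da with one Met and four Cys (illustrative only).
shifted = labeled_mass(12000.0, n_met=1, n_cys=4)
print(f"Expected mass after full labeling: {shifted:.1f} Da")
```

Because a diselenide bridge loses two hydrogens just as a disulfide does, the per-residue shift is the same whether the selenocysteines are bridged or free.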
Acknowledgments
We thank Philippe Carpentier for his assistance during data collection on beamline BM30A (ESRF, Grenoble), the staff of beamline W32 (LURE, Orsay), Ioan Lascu and Tudor Borza for providing the clones of NDP kinase H122C, and Jérôme Grassy for his help with the statistical analysis of the PDB and SwissProt databases. The work was supported by grants from MENRT and FRM (J.-F.S.).
Received: May 28, 2003 Accepted: August 1, 2003 Published: November 4, 2003
References
Altamirano, M.M., Garcia, C., Possani, L.D., and Fersht, A.R. (1999). Oxidative refolding chromatography: folding of the scorpion toxin Cn5. Nat. Biotechnol. 17, 187–191.
Arner, E.S., Sarioglu, H., Lottspeich, F., Holmgren, A., and Böck, A. (1999). High-level expression in Escherichia coli of selenocysteine-containing rat thioredoxin reductase utilizing gene fusions with engineered bacterial-type SECIS elements and co-expression with the selA, selB and selC genes. J. Mol. Biol. 292, 1003–1016.
Bae, J.H., Alefelder, S., Kaiser, J.T., Friedrich, R., Moroder, L., Huber, R., and Budisa, N. (2001). Incorporation of beta-selenolo[3,2-b]pyrrolyl-alanine into proteins for phase determination in protein X-ray crystallography. J. Mol. Biol. 309, 925–936.
Bairoch, A., and Apweiler, R. (2000). The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48.
Bellizzi, J.J., Widom, J., Kemp, C.W., and Clardy, J. (1999). Producing selenomethionine-labeled proteins with a baculovirus expression vector system. Struct. Fold. Des. 7, R263–R267.
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. (2000). The protein data bank. Nucleic Acids Res. 28, 235–242.
Besse, D., Budisa, N., Karnbrock, W., Minks, C., Musiol, H.J., Pegoraro, S., Siedler, F., Weyher, E., and Moroder, L. (1997). Chalcogen-analogs of amino acids. Their use in X-ray crystallographic and folding studies of peptides and proteins. Biol. Chem. 378, 211–218.
Böck, A., Forchhammer, K., Heider, J., and Baron, C. (1991). Selenoprotein synthesis: an expansion of the genetic code. Trends Biochem. Sci. 16, 463–467.
Boggon, T.J., and Shapiro, L. (2000). Screening for phasing atoms in protein crystallography. Struct. Fold. Des. 8, R143–R149.
Boschi-Muller, S., Muller, S., Van Dorsselaer, A., Böck, A., and Branlant, G. (1998). Substituting selenocysteine for active site cysteine 149 of phosphorylating glyceraldehyde 3-phosphate dehydrogenase reveals a peroxidase activity. FEBS Lett. 439, 241–245.
Budisa, N., Steipe, B., Demange, P., Eckerskorn, C., Kellermann, J., and Huber, R. (1995). High-level biosynthetic substitution of methionine in proteins by its analogs 2-aminohexanoic acid, selenomethionine, telluromethionine and ethionine in Escherichia coli. Eur. J. Biochem. 230, 788–796.
Budisa, N., Karnbrock, W., Steinbacher, S., Humm, A., Prade, L., Neuefeind, T., Moroder, L., and Huber, R. (1997). Bioincorporation of telluromethionine into proteins: a promising new approach for X-ray structure analysis of proteins. J. Mol. Biol. 270, 616–623.
Bushnell, D.A., Cramer, P., and Kornberg, R.D. (2001). Selenomethionine incorporation in Saccharomyces cerevisiae RNA polymerase II. Structure 9, R11–R14.
CCP4 (Collaborative Computational Project 4) (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760–763.
Chen, W., and Bahl, O.P. (1991). Recombinant carbohydrate and selenomethionyl variants of human choriogonadotropin. J. Biol. Chem. 266, 8192–8197.
Cowie, D.B., and Cohen, G.N. (1957). Biosynthesis by E. coli of active proteins containing selenium instead of sulphur. Biochim. Biophys. Acta 26, 252–261.
Cowtan, K.D., and Zhang, K.Y. (1999). Density modification for macromolecular phase improvement. Prog. Biophys. Mol. Biol. 72, 245–270.
Dauter, Z. (2002). New approaches to high-throughput phasing. Curr. Opin. Struct. Biol. 12, 674–678.
Dauter, Z., and Adamiak, D.A. (2001). Anomalous signal of phosphorus used for phasing DNA oligomer: importance of data redundancy. Acta Crystallogr. D 57, 990–995.
Dauter, Z., Dauter, M., de La Fortelle, E., Bricogne, G., and Sheldrick, G.M. (1999). Can anomalous signal of sulfur become a tool for solving protein crystal structures? J. Mol. Biol. 289, 83–92.
Dauter, Z., Dauter, M., and Dodson, E. (2002). Jolly SAD. Acta Crystallogr. D 58, 494–506.
de La Fortelle, E., and Bricogne, G. (1997). Maximum-likelihood heavy atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods Enzymol. 276, 472–494.
Deacon, A.M., and Ealick, S.E. (1999). Selenium-based MAD phasing: setting the sites on larger structures. Struct. Fold. Des. 7, R161–R166.
Debreczeni, J.E., Bunkoczi, G., Ma, Q., Blaser, H., and Sheldrick, G.M. (2003). In-house measurement of the sulfur anomalous signal and its use for phasing. Acta Crystallogr. D 59, 688–696.
Doublié, S. (1997). Preparation of selenomethionyl proteins for phase determination. Methods Enzymol. 276, 523–530.
Dumas, C., Lascu, I., Morera, S., Glaser, P., Fourme, R., Wallet, V., Lacombe, M.L., Véron, M., and Janin, J. (1992). X-ray structure of nucleoside diphosphate kinase. EMBO J. 11, 3203–3208.
Ealick, S.E. (2000). Advances in multiple wavelength anomalous diffraction crystallography. Curr. Opin. Chem. Biol. 4, 495–499.
Hendrickson, W.A. (1991). Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science 254, 51–58.
Hendrickson, W. (1999). Maturation of MAD phasing for the determination of macromolecular structures. J. Synchrotron Radiat. 6, 845–851.
Hendrickson, W.A., and Ogata, C.M. (1997). Phase determination from multiwavelength anomalous diffraction measurements. Methods Enzymol. 276, 494–523.
Hendrickson, W.A., and Teeter, M.M. (1981). Structure of the hydrophobic protein crambin determined directly from the anomalous scattering of sulfur. Nature 290, 107–113.
Hendrickson, W.A., Horton, J.R., and LeMaster, D.M. (1990). Selenomethionyl proteins produced for analysis by multiwavelength anomalous diffraction (MAD): a vehicle for direct determination of three-dimensional structure. EMBO J. 9, 1665–1672.
Kigawa, T., Yamaguchi-Nunokawa, E., Kodama, K., Matsuda, T., Yabuki, T., Matsuda, N., Ishitani, R., Nureki, O., and Yokoyama, S. (2002). Selenomethionine incorporation into a protein by cell-free synthesis. J. Struct. Funct. Genomics 2, 29–35.
Larsson, A.M., Stahlberg, J., and Jones, T.A. (2002). Preparation and crystallization of selenomethionyl dextranase from Penicillium minioluteum expressed in Pichia pastoris. Acta Crystallogr. D 58, 346–348.
Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. (1993). PROCHECK: a program to check the stereochemistry quality of protein structures. J. Appl. Crystallogr. 26, 283–291.
Lemke, C.T., Smith, G.D., and Howell, P.L. (2002). S-SAD, Se-SAD and S/Se-SIRAS using Cu Kalpha radiation: why wait for synchrotron time? Acta Crystallogr. D 58, 2096–2101.
Lustbader, J.W., Wu, H., Birken, S., Pollak, S., Gawinowicz Kolks, M.A., Pound, A.M., Austen, D., Hendrickson, W.A., and Canfield, R.E. (1995). The expression, characterization, and crystallization of wild-type and selenomethionyl human chorionic gonadotropin. Endocrinology 136, 640–650.
Micossi, E., Hunter, W.N., and Leonard, G.A. (2002). De novo phasing of two crystal forms of tryparedoxin II using the anomalous scattering from S atoms: a combination of small signal and medium resolution reveals this to be a general tool for solving protein crystal structures. Acta Crystallogr. D 58, 21–28.
Morera, S., Lascu, I., Dumas, C., LeBras, G., Briozzo, P., Veron, M., and Janin, J. (1994). Adenosine 5′-diphosphate binding and the active site of nucleoside diphosphate kinase. Biochemistry 33, 459–467.
Müller, S., Senn, H., Gsell, B., Vetter, W., Baron, C., and Böck, A. (1994). The formation of diselenide bridges in proteins by incorporation of selenocysteine residues: biosynthesis and characterization of (Se)2-thioredoxin. Biochemistry 33, 3404–3412.
Murshudov, G.N., Vagin, A.A., Lebedev, A., Wilson, K.S., and Dodson, E.J. (1999). Efficient anisotropic refinement of macromolecular structures using FFT. Acta Crystallogr. D 55, 247–255.
Otwinowski, Z., and Minor, W. (1997). Processing of x-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–325.
Pegoraro, S., Fiori, S., Rudolph-Bohner, S., Watanabe, T.X., and Moroder, L. (1998). Isomorphous replacement of cystine with selenocystine in endothelin: oxidative refolding, biological and conformational properties of [Sec3,Sec11,Nle7]-endothelin-1. J. Mol. Biol. 284, 779–792.
Pegoraro, S., Fiori, S., Cramer, J., Rudolph-Bohner, S., and Moroder, L. (1999). The disulfide-coupled folding pathway of apamin as derived from diselenide-quenched analogs and intermediates. Protein Sci. 8, 1605–1613.
Rice, L.M., Earnest, T.N., and Brunger, A.T. (2000). Single-wavelength anomalous diffraction phasing revisited. Acta Crystallogr. D 56, 1413–1420.
Roth, M., Carpentier, P., Kaikati, O., Joly, J., Charrault, P., Pirocchi, M., Kahn, R., Fanchon, E., Jacquamet, L., Borel, F., et al. (2002). FIP: a highly automated beamline for multiwavelength anomalous diffraction experiments. Acta Crystallogr. D 58, 805–814.
Rudenko, G., Nguyen, T., Chelliah, Y., Sudhof, T.C., and Deisenhofer, J. (1999). The structure of the ligand-binding domain of neurexin Ibeta: regulation of LNS domain function by alternative splicing. Cell 99, 93–101.
Sanchez, J.F., Hoh, F., Strub, M.P., Aumelas, A., and Dumas, C. (2002a). Structure of the cathelicidin motif of protegrin-3 precursor: structural insights into the activation mechanism of an antimicrobial protein. Structure 10, 1363–1370.
Sanchez, J.F., Wojcik, F., Yang, Y.S., Strub, M.P., Strub, J.M., Van Dorsselaer, A., Martin, M., Lehrer, R., Ganz, T., Chavanieu, A., et al. (2002b). Overexpression and structural study of the cathelicidin motif of the protegrin-3 precursor. Biochemistry 41, 21–30.
Smith, J.L., and Thompson, A. (1998). Reactivity of selenomethionine-dents in the magic bullet? Structure 6, 815–819.
Stadtman, T.C. (1996). Selenocysteine. Annu. Rev. Biochem. 65, 83–100.
Tesmer, J.J., Klem, T.J., Deras, M.L., Davisson, V.J., and Smith, J.L. (1996). The crystal structure of GMP synthetase reveals a novel catalytic triad and is a structural paradigm for two enzyme families. Nat. Struct. Biol. 3, 74–86.
Thomazeau, K., Curien, G., Thompson, A., Dumas, R., and Biou, V. (2001). MAD on threonine synthase: the phasing power of oxidized selenomethionine. Acta Crystallogr. D 57, 1337–1340.
Van Duyne, G.D., Standaert, R.F., Karplus, P.A., Schreiber, S.L., and Clardy, J. (1993). Atomic structures of the human immunophilin FKBP-12 complexes with FK506 and rapamycin. J. Mol. Biol. 229, 105–124.
Wu, H., Lustbader, J.W., Liu, Y., Canfield, R.E., and Hendrickson, W.A. (1994). Structure of human chorionic gonadotropin at 2.6 Å resolution from MAD analysis of the selenomethionyl protein. Structure 2, 545–558.
Xu, B., Munoz, I.I., Janson, J.C., and Stahlberg, J. (2002). Crystallization and X-ray analysis of native and selenomethionyl beta-mannanase Man5A from blue mussel, Mytilus edulis, expressed in Pichia pastoris. Acta Crystallogr. D 58, 542–545.
Accession Numbers
The coordinates and structure factors have been deposited in the Protein Data Bank with the entry codes 1PAE and 1PFP for [SeMet, SeCys] derivatives of NDP kinase and ProS, respectively. The E. coli BL21(DE3)cys auxotrophic strain is available from the authors upon request (
[email protected] or
[email protected]).