Methods 46 (2008) 54–61
Contents lists available at ScienceDirect
Methods journal homepage: www.elsevier.com/locate/ymeth
Mass spectrometry of full-length integral membrane proteins to define functionally relevant structural features

Guillaume Gabant, Martine Cadene *

Centre de Biophysique Moléculaire, CNRS UPR4301, Rue Charles Sadron, 45071 Orleans cedex 2, France
Article info

Article history: Accepted 20 October 2008; available online 29 October 2008

Keywords: Mass spectrometry; Integral membrane protein; Structure–function relationship; Limited proteolysis; Protocol
Abstract

The crystallization and structure determination of integral membrane proteins remain difficult tasks whose success relies on a good understanding of the behavior of the protein. To date, membrane protein structures are still far outnumbered by soluble protein structures. Mass spectrometry is a powerful and versatile tool offering deep insights into the state of the integral membrane protein the structural biologist intends to crystallize. With appropriate sample preparation methods, it provides information that can sometimes prove critical at various stages of the structure determination process, from protein expression to model building. Moreover, valuable knowledge is gained when the identified structural features underlie important functional aspects. Electrospray and matrix-assisted laser desorption ionization (MALDI) methods, however, face a particular challenge when dealing with integral membrane proteins. A MALDI method specifically optimized for membrane protein analysis is presented here, with detailed information on sample preparation and deposition, as well as guidelines for domain determination by limited proteolysis. MALDI-time-of-flight mass spectrometry can be used to take a proper inventory of initiation sites, to tailor a protein to a stable, well-folded form, and to evaluate selenomethionine replacement. These approaches are illustrated with a few examples drawn from the structural biology of ion channels.

© 2008 Elsevier Inc. All rights reserved.
* Corresponding author. Fax: +33 238 63 15 17. E-mail address: cad[email protected] (M. Cadene).
1046-2023/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.ymeth.2008.10.021

1. Introduction

When Roderick MacKinnon and his team undertook the crystallization and 3D-structure determination of the potassium channel circa 1996, there were coordinates for only six integral membrane proteins in the Protein Data Bank (PDB). With so few precedents for success, the task was considered daunting at the time. The work led to the crystal structure of KcsA [1], a world first for ion channels in general and a landmark for the understanding of the functional mechanism of potassium channels in particular. To date, the structures of about 90 integral membrane proteins have been determined. While this figure shows that confidence in achieving this type of result is growing, it is still small in comparison to the number of soluble protein structures. Yet membrane proteins play a very important role in cells and account for about 50% of therapeutic targets. Structural information on integral membrane proteins is thus as desirable as ever. Here, we describe some mass spectrometry (MS) tools that may be used in combination with biochemical manipulations to assist in obtaining homogeneous, stable protein preparations for crystallization. These tools have also been used to derive preliminary information about the protein
structure, information which has sometimes proved valuable at various stages of the structure determination process.

Methods for the mass spectrometric analysis of membrane proteins have evolved from requiring high amounts of material [2–5] to more reasonably scaled approaches in the past few years [6,7]. Upon purification, the membrane protein finds itself embedded in a lipid-detergent shell arranged in a toroid shape. The basic issue for MS analysis is to smoothly transfer the protein from this membrane-surrogate toroid to the solvents used in the analysis, without provoking precipitation or aggregation. Methods based on electrospray ionization generally still require a purification step to rid the solution of detergents, which are largely incompatible with this type of ionization (for an example, see [8]). MALDI is amenable to direct analysis without prior removal of the detergent, provided the latter has been selected to be compatible with the ionization and sample deposition method. Ionic detergents are still largely incompatible with MALDI, in which case one should consider exchanging the detergent for a zwitterionic or non-ionic one that is compatible with MS. With the latter two types of detergents, the ultrathin layer method is a good tool to achieve this smooth transfer of the protein from the buffer solution to the matrix–analyte mixture solution, then on to sample spotting and solid-state co-crystal arrangement on the sample plate. Fig. 1A shows the signals obtained in the analysis of a
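[Fig. 1 graphic. Panel A: MALDI-TOF mass spectra, intensity (a.u.) versus m/z (0–40000), showing the dried droplet and ultrathin layer traces with charge states 1+ to 4+ labeled. Panels B and C: photographs.]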
Fig. 1. (A) MALDI-TOF mass spectra of bacterial Silicibacter pomeroyi sodium channel, 300 fmol/µL in 4HCCA/FWI (6 pmol/µL in buffer with 10 mM DM), see text for experimental details. a.u., arbitrary units. Spots were obtained using the dried droplet (upper spectrum) or the ultrathin layer (lower spectrum) methods. The number of charges is indicated on top of the peak for each charge state observed for the protein. The upper spectrum is offset for display purposes. (B) MALDI spots on gold-plated sample plate, spotted by hand (from left to right): dried droplet, ultrathin layer. (C) Edge of the ultrathin layer on the MALDI sample stage, with the plate angled towards light.
sodium channel using the dried droplet [9] compared to the ultrathin layer method [6], while all other conditions were kept identical. The photographs below (Fig. 1B and 1C) show the aspect of the spots and the ultrathin layer itself. In the authors' as well as others' hands, sandwich or regular thin layer methods [9,10] do not afford similar jumps in sensitivity and resolution. Although the set of methods presented here has been applied to a wide range of membrane protein types (β-barrel transporters, α-helix channels and transporters [11,12], receptors such as GPCRs [6]), examples given herein will focus on ion channels.

2. Ultrathin layer MALDI spot preparation method

The protocol for the ultrathin layer method was first published in 2000 [6], and was the object of a videopublication in 2007 [13]. It is derived from classical thin layer approaches [9,10]. This optimized variation was found to be particularly efficient and robust for the analysis of integral membrane protein mixtures in the presence of concentrations of detergents usually found in stabilization buffers, as well as reasonably high concentrations of additives used in the biochemical preparation of proteins, such as imidazole, NaCl, even urea. Table 1 shows a list of common buffer components as well as their usable concentrations in buffer, provided that the 1/20 dilution in matrix solution is respected.

Interestingly, the method has shown great versatility in that it can also be used for soluble protein mixtures in the 'FWI' solvent described below. Soluble proteins however do not fare well if one completely removes water from this solvent (Steven L. Cohen, personal communication). The ability to analyze soluble proteins concomitantly with membrane proteins is an important feature for the detection of soluble protein contaminants, soluble cleavage fragments, or soluble co-expressed domains. One of the main strengths of the method is that it limits the intensity bias in favor of smaller molecular ions observed in other methods, so that a whole protein can be analyzed concomitantly with cleavage products of any size above 700 Da, whether they are hydrophobic or not. This is an important feature for applications such as limited proteolysis experiments. Finally, the relative homogeneity of the signal over the whole sample spot greatly facilitates data acquisition.

The general protocol flow is displayed in Fig. 2. The application of a very thin matrix layer all over the sample plate is usually mastered by operators within 1–3 trials. It should be stressed that care and timing are an important part of success in this protocol. Consequently, the following steps should not be rushed.

Notes: Microtubes should never be siliconized or made of colored plastic. Unless otherwise stated, all solvents are high-purity HPLC grade, and 'ultrapure' water is prepared by a reverse osmosis/ultrafiltration system (resistivity over 18 MΩ·cm). Methanoic acid 88% (v/v) is referred to simply as "formic acid" in the text. The 4HCCA¹ matrix is recrystallized grade (for example from Applied Biosystems, or Bruker Daltonics) or reagent grade (Sigma–Aldrich) purified by acid–base precipitation in the laboratory, and is resuspended in acetonitrile for aliquoting. Aliquots are prepared by withdrawal from the stock with constant resuspension (using a cut micropipette tip), dried in a Speed-Vac and stored with caps firmly in place at room temperature in the dark. One gram of matrix can thus be aliquoted in ~100 tubes. Aliquots are stable for a few weeks at room temperature and at least one year at −80 °C.
If frozen for

¹ Abbreviations used: 4HCCA, 4-hydroxy cyano cinnamic acid (alpha-cyano 4-hydroxy cinnamic acid); CMC, critical micellar concentration; MALDI, matrix-assisted laser desorption ionization; TFA, trifluoroacetic acid; TOF, time-of-flight.
Table 1. Tolerances for common membrane protein buffer components in ultrathin layer MALDI-TOF analysis

Column "In mixture": maximum final concentration (b) in the matrix–analyte mixture solution, i.e. after 1/20 dilution into the saturated matrix solution.
Column "In buffer": maximum final concentration in buffer, i.e. before 1/20 dilution into the saturated matrix solution.

Component                                   In mixture    In buffer    CMC (a)
Imidazole                                   50 mM         1 M
Urea                                        80 mM         1.6 M
NaCl, KCl                                   15 mM         300 mM
Tris-HCl or similar organic buffer          5 mM          100 mM
Glycerol, DMSO                              0.25%         5%
Phosphate buffer, PBS                       should be avoided altogether
Non-ionic detergents
  n-Dodecyl-β-D-maltoside (DDM)             0.5 mM        10 mM        0.17 mM
  n-Decyl-β-D-maltoside (DM)                0.5 mM        10 mM        1.8 mM
  n-Octyl-β-D-glucopyranoside (βOG)         1.5 mM        30 mM        19 mM
Zwitterionic detergents
  Lauryldimethylamine oxide (LDAO)          0.5 mM        10 mM        1.5 mM
  Zwittergent 3-12                          0.25 mM       5 mM         2–4 mM
  N-dodecylphosphocholine                   0.25 mM       5 mM         1.5 mM
Ionic detergents                            should be avoided altogether

Notes
Imidazole: used in the purification of overexpressed His-tagged proteins.
Urea: chaotropic agent, used for denaturation/renaturation.
NaCl, KCl: Na+ and K+ adducts will be observed in MS when NaCl concentration is above 50 mM in buffer before dilution into matrix solution.
Glycerol, DMSO: non-volatile compounds interfere with matrix crystallization.
Phosphate buffer, PBS: phosphates interfere with ionization. Buffer exchange is recommended.

Table expanded from Cadène and Chait [6].
(a) Critical micellar concentration (CMC) values provided by Anatrace, Inc. and Calbiochem, Inc.
(b) Concentrations exceeding these values will prevent proper formation of matrix–sample cocrystals. If higher detergent concentrations are required (e.g. for βOG), the sample dilution step into matrix solution should be raised proportionally.
storage, aliquots have to be warmed up to room temperature (RT) before use to prevent hygroscopic collection of water onto the matrix.

2. A. Preparation of saturated matrix solution for sample dilution

This solution is used to dilute samples immediately before spotting onto the sample plate. Although it is only used after making the thin layer, the solvent used to make the saturated matrix solution has to be prepared beforehand.

A1: Preparation of matrix solvent. The FWI solvent mixture used for membrane proteins (consisting of 3:1:2 formic acid:water:isopropyl alcohol (v/v)) has to be prepared at least 4 h in advance. The reason for this is unknown; however, using freshly prepared FWI is the most common cause of failure. Once prepared, the solution can be stored for weeks in a clean scintillation vial, washed free of any Triton used in the manufacturing process. Aluminium-lined lids should
[Fig. 2 graphic. Workflow labels: apply a very thin layer of 4HCCA matrix to the sample plate; dilute sample protein 1/20 in a solution of 4HCCA saturated in FWI; spot a 0.5 µL aliquot; wash (0.1% TFA); MALDI-TOF MS (spectrum versus m/z).]
Fig. 2. Ultrathin layer method deposition scheme for MALDI analysis.
not be used to avoid solvent contamination by copper. Because of the high formic acid content, it is advisable to store the FWI stock in a ventilated fume hood, with a tightly closed plastic lid.

A2: Preparation of saturated matrix solution for sample dilution. To a dried matrix aliquot, ~100–150 µL of FWI solvent is added. The solution is pipetted up and down 10–15 times and vortexed at maximum speed for 50–60 s. This ensures thorough mixing, which is critical. If the solution is not clearly saturated, i.e. if all or nearly all the 4HCCA matrix has been solubilized in the solvent, the solution is pipetted back up and added to another aliquot of dried matrix. The suspension is mixed thoroughly as above. A cloud of suspended matrix should be clearly visible upon shaking the tube.

A3: The suspension prepared in A2 is centrifuged in a table-top microcentrifuge for at least 6 min at 13,500 rpm (~14,000 g for a typical 7 cm mean radius rotor). Careful centrifugation ensures the removal of matrix micro-aggregates, which readily precipitate membrane proteins, leading to a dramatic loss in sensitivity. Speed-start table microfuges are not appropriate for this step because the centrifugal force is insufficient to pellet matrix micro-aggregates. After centrifugation, taking care not to disturb the matrix pellet, one gently pipettes up the supernatant and transfers it to a fresh tube. The tube is labeled with the matrix and solvent names (4HCCA/FWI). This solution can only be used on the day it was prepared.

2. B. Preparation of diluted matrix solution for the plate

B1: First, a saturated matrix solution is prepared using a dried matrix aliquot as described in step A, using ~150 µL of TWA solvent (TWA: 1:1 water:acetonitrile (v/v) with 0.1% final TFA). The saturated matrix solution is transferred to a fresh tube labeled 4HCCA/TWA. This solution should be used in step B2 on the day it was prepared.
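The rpm-to-g conversion quoted in step A3 can be checked with the standard relation RCF = 1.118 × 10⁻⁵ × r × N², where r is the rotor radius in cm and N the speed in rpm. The short sketch below is only a convenience for adapting that step to a different rotor; the function names are hypothetical and are not part of the published protocol.

```python
# Minimal sketch: convert between rotor speed (rpm) and relative
# centrifugal force (x g) using the standard formula
# RCF = 1.118e-5 * radius_cm * rpm**2.
# Function names are illustrative assumptions, not from the protocol.

RCF_CONSTANT = 1.118e-5  # standard conversion constant for radius in cm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return (rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


if __name__ == "__main__":
    # Step A3 values: 13,500 rpm on a 7 cm mean radius rotor.
    print(f"{rcf_from_rpm(13500, 7.0):.0f} x g")
    # Speed needed on a hypothetical 6 cm rotor to reach the same force.
    print(f"{rpm_for_rcf(rcf_from_rpm(13500, 7.0), 6.0):.0f} rpm")
```

With the step A3 values the first call gives a force slightly above 14,000 g, in line with the figure quoted in the text.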
B2: The saturated 4HCCA/TWA solution is diluted 1/4 using isopropyl alcohol. This provides the diluted matrix solution for the plate. The sample plate solution is stable for months in the dark at room temperature in a properly sealed tube.

2. C. Sample plate preparation

C1: It is recommended to prepare a clean working surface on the bench for the sample plate. The sample plate can be a gold-plated or a stainless steel plate. A gold surface simply makes visualization of ultrathin matrix layers easier. Plates with recessed wells should not be used. Plates with lightly engraved spot rings or no engraving at all are best suited for this method. This is the only step that should be undertaken under a fume hood, preferably wearing gloves. Wash steps are as follows: reagent grade methanol (MeOH) is squirted onto the top of the plate from a squeeze bottle, and wiped with a lint-free tissue wipe. Then the plate is washed by alternating deionized water with reagent grade MeOH, always wiping the solvent with a lint-free wipe and ending with a MeOH step, until the plate is spotless. Acetone can be used to dry cracks and holes on the back of the plate, but should never be used on the working surface of a plate, as the stripping power of acetone creates a slippery surface which impairs the deposition of matrix solutions.

C2: The plate is left to warm back to room temperature on the bench (solvent evaporation in step C1 causes the plate to cool down). If the plate has been warmed to speed up drying, it is very important to let it cool back to RT on the bench before proceeding to the next steps. In 5–10 min, the plate is both completely dry and at RT.

2. D. Applying the ultrathin layer of matrix onto the plate

D1: The working area is prepared, with lint-free wipes and a protein-gel pipette tip at hand. Regular pipette tips can be used, provided they are big enough and notch-free.
As described below, a volume of 20 µL of the sample plate solution prepared in Section B will be sufficient to cover roughly 20 cm² of plate surface. If using standard 384-spot plates (usable surface 10 × 7 cm), the plate can be covered in four quarters. Smaller plates (such as the Applied Biosystems 96-spot plate, usable surface 4.5 × 4.5 cm) can be covered in one go. The outer 1 cm from the edge of the plate is left free of matrix to minimize transfer of matrix to the instrument plate holder. This also guards against depositing spots in areas where edge effects can alter the quality of mass calibration.

D2: About 20 µL of diluted matrix solution is applied to the plate. The solution is immediately spread onto the surface of the plate using 2–3 large strokes. It is important not to "paint" the plate, as this will result in non-homogeneous deposits. The matrix solution starts drying immediately due to the high isopropyl alcohol content. The "wiping tool" is prepared by curling a wipe around an index finger. A maximum of one or two rounds is preferable for good fingertip sensitivity to surfaces. The free wipe end can be trapped with the thumb.

D3: When the matrix solution on the plate is almost dry, the small amount of remaining moisture is gathered with the wipe around a fingertip and tapped over the whole matrix surface to redistribute moisture. Then the matrix is wiped off the plate using the wipe on the finger, applying fairly high pressure and going in 3–4 long strokes. The matrix layer should be wiped until the amount of matrix left on the plate is barely visible as a yellowish reflection when the plate is angled towards the light. On stainless steel, only the outer edge of the matrix layer is visible. The goal is to produce a homogeneous, ultrathin layer of matrix, as this will be the seeding bed for the formation of matrix–analyte co-crystals.
The quality of the ultrathin layer can immediately be tested using a saturated solution of matrix (for example, using the 4HCCA/FWI solution prepared in step A). A whitish, uniform matrix spot should start forming within 15–30 s (Fig. 1B). In extreme atmospheric conditions of temperature and/or humidity, the crystallization on the plate could take a little longer. As soon as the matrix crystal layer is uniform, the excess solvent droplet is aspirated using a vacuum line with a pipette tip on the end. The ultrathin layer on the plate can be used for months, as long as samples are spotted using the same matrix (e.g., 4HCCA matrix–protein solution onto 4HCCA layer). Sinapic acid can also be used to make an ultrathin layer, and sinapic acid–protein solutions spotted onto the layer using the same protocol (higher sinapic acid concentrations are required to reach saturation). Sinapic acid matrix however may produce matrix–protein adducts, whereas 4HCCA matrix does not when using the ultrathin method.

2. E. Sample preparation and deposition

E1: Sample preparation. The method is relatively robust and tolerant to contaminants within the limits indicated in Table 1. However, it is not compatible with ionic detergents, or with very large concentrations of any given reagent. Buffer exchange (through gel permeation–exclusion chromatography) should be considered if contaminant concentrations exceed the tolerances of the method. The analysis of membrane proteins is achieved by keeping the protein/contaminants ratio at an optimum. On one hand, the detergent concentration in the sample should be at least 1.5–2 times the CMC for protein stability, and there is a practical limit to the protein concentration. This sets an upper limit to the protein/detergent ratio in the sample up until the analysis.
On the other hand, the protein concentration after dilution in the matrix solution has to be high enough to achieve good signal, and the detergent concentration should be low enough to prevent interference. Since the protein and contaminants are diluted at the same time into the matrix solution, there is an optimum dilution factor to achieve good sensitivity with low contaminant interference. In other words, the dilution factor has to be a compromise between low contaminant concentration (one has to dilute enough) and high protein concentration (one cannot dilute too much). It is thus really important to dilute the protein sample 1/20 in saturated 4HCCA/FWI matrix solution (as prepared in A). If high detergent concentrations are needed to reach the CMC, and buffer exchange is not feasible, then the protein sample should be further diluted into the matrix solution. The final protein concentration should remain above 300 fmol/µL for each protein species in a mixture, meaning the stock protein solution should have at least ~6 pmol/µL of each protein species, or higher if the dilution factor needs to be increased. Typically, 9.5 µL of 4HCCA/FWI matrix solution is added to 0.5 mL microtubes (transparent, silicone-free) and the tube lids are immediately closed to prevent evaporation. Then 0.5 µL of the protein solution to analyze is added to the 9.5 µL of matrix solution, pipette-mixing 15–20 times into the solution, and the mixture is carefully vortexed at a low enough speed to prevent dispersion of the mixture onto the microtube walls and lid.

E2: Sample deposition. Once the protein–matrix mixture solution is prepared, it should be deposited onto the sample plate within ten minutes. In our experience, longer exposure of the protein to the formic acid in the 4HCCA/FWI mixture will result in serine and threonine formylation. If the formic acid stock is low grade and contains peroxides, oxidation may also be observed. This is usually not a problem with typical, reagent grade formic acid. To spot the protein sample onto the plate, 0.5 µL of the protein–matrix mixture prepared above is deposited on the plate, taking care not to touch the plate with the pipette tip so as not to scrape the thin layer off the plate. The spot starts forming from the
outside of the droplet and reaches homogeneity within 15–30 s, appearing whitish and opaque. In some cases, particularly if the detergent is highly concentrated, a small void in the center of the spot can remain after the rest of the spot has crystallized properly. This does not preclude analysis. As soon as the spot has formed to homogeneity, the excess liquid is aspirated with a vacuum line and the spot is allowed to air-dry.

E3: Washing. The samples can be washed using room temperature 0.1% aqueous TFA (note: for hydrophilic proteins and peptides, washes should be avoided or performed with ice-cold 0.1% TFA). A 1–2 µL droplet of 0.1% TFA is applied on top of the spot. After a few seconds (up to 15 s), the wash is aspirated with a vacuum line. The spot is allowed to air-dry: it is now ready for mass spectrometry analysis. If the sample spot was prepared with FWI, formylation may start to occur if the spot is not analyzed within the hour.

Note: if, on first attempt, the spot does not produce usable signal in MALDI-TOF MS, a higher dilution step in 4HCCA/FWI solution (e.g., 1/40 or 1/60) should be attempted. This has a far better rate of success than trying a lower dilution factor. Typically, the contaminants preclude ionization and thus greater dilution will work better. If a highly concentrated protein can be prepared, a 1/100 dilution can be employed without problem.

E4: Instrumental setup. The ultrathin layer method is useful for analysis of full-length proteins as well as mixtures of protein and cleavage products, with length varying from a few amino acids to nearly full-length protein. Ionization can be performed with any instrument equipped with a MALDI source, although the most common application is with MALDI-TOF instruments. The method performs well with MALDI-TOF-TOF, MALDI-ion trap and MALDI-Qq-TOF configurations, provided that the laser power is set to about 10–15% above the ionization threshold.
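Returning to the dilution guidance of step E1 and the note under E3, the choice of dilution factor can be written down as a small check: every buffer component must fall at or below its Table 1 limit in the matrix–analyte mixture, while the protein stays at or above 300 fmol/µL. The sketch below is a hypothetical helper under those assumptions; the function name, the dictionary keys and the list of candidate factors (1/20, 1/40, 1/60, 1/100, taken from the text) are illustrative choices, and only a subset of Table 1 is encoded.

```python
from typing import Dict, Optional, Sequence

# Maximum concentrations (mM) tolerated in the matrix-analyte mixture,
# copied from Table 1. Keys are shorthand labels chosen for this sketch.
MAX_IN_MIXTURE_MM: Dict[str, float] = {
    "imidazole": 50.0,
    "urea": 80.0,
    "NaCl": 15.0,
    "Tris-HCl": 5.0,
    "DDM": 0.5,
    "DM": 0.5,
    "bOG": 1.5,
    "LDAO": 0.5,
}

# Lower bound on protein concentration after dilution (step E1).
MIN_PROTEIN_FMOL_PER_UL = 300.0


def pick_dilution(
    protein_pmol_per_ul: float,
    buffer_mm: Dict[str, float],
    factors: Sequence[int] = (20, 40, 60, 100),
) -> Optional[int]:
    """Return the smallest dilution factor meeting both constraints, or None.

    buffer_mm maps a component label (must be a key of MAX_IN_MIXTURE_MM)
    to its concentration in the sample buffer, in mM.
    """
    for factor in factors:
        protein_final = protein_pmol_per_ul * 1000.0 / factor  # fmol/uL
        protein_ok = protein_final >= MIN_PROTEIN_FMOL_PER_UL
        buffer_ok = all(
            conc / factor <= MAX_IN_MIXTURE_MM[name]
            for name, conc in buffer_mm.items()
        )
        if protein_ok and buffer_ok:
            return factor
    return None  # no listed factor works: concentrate or buffer-exchange


if __name__ == "__main__":
    # Conditions of Fig. 1A: 6 pmol/uL protein in 10 mM DM.
    print(pick_dilution(6.0, {"DM": 10.0}))
    # A sample in 60 mM bOG needs more dilution, hence more protein.
    print(pick_dilution(20.0, {"bOG": 60.0}))
```

A return value of None corresponds to the situation described in E1 where neither the standard nor a higher dilution is workable, so that buffer exchange or a more concentrated protein stock should be considered.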
The ultrathin layer requires less laser power than the dried droplet method. Since a given position on a sample spot will be dug through within about 50 shots, it is
recommended to stay with laser shot frequencies below 20 Hz, and to move the laser target position every 25 shots. For whole proteins, the best results are obtained with acquisition rates below 5 Hz. Good statistics are obtained with 150–200 laser hits. The method calls for relatively low laser strengths, which favors higher resolution, limits the formation of artefactual gas-phase dimers, and apparently eliminates the formation of 4HCCA matrix adducts (sinapic acid adducts can still be observed when this matrix is used instead of 4HCCA). With careful internal calibration, masses can be measured with statistical accuracies of 30–150 ppm in linear TOF mode (using several charge states for full-length protein or large folding domains), and 5–10 ppm in reflector TOF mode (for masses below 5000 Da). Finally, this method has been routinely used in our group as well as several others for the analysis of peptide mixtures in the context of proteomics peptide mapping and modified peptide characterization [14–20]. In this case, good results are often obtained using standard WA (2:1 water:acetonitrile; v/v) or TWA (2:1 water:acetonitrile (v/v) with 0.1% final TFA) solvent solutions for the matrix solution.

3. Role of initiator methionine(s) in protein stability and protein function

In routine structural biology work, over-expressed proteins are first checked for proper primary sequence using SDS–PAGE. This provides a rough estimate of the molecular mass, and indicates whether the protein preparation is homogeneous. If the protein preparation shows more than a single thin band, MS should be used to identify precisely the origin of the heterogeneity. Homogeneous protein preparations are desirable to favor the formation of crystals appropriate for X-ray diffraction. As the example below will show, a gel band at a lower apparent mass does not always translate into a proteolytic
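[Fig. 3 graphic. Panel A: MALDI-TOF mass spectrum, intensity (a.u.) versus m/z (9500–12500). Panel B: schematic of the channel and its RCK domain. Labels in the graphic include M240, M240 (2+), (3+), M107, N 1, C 340, C 417, pore and RCK domain.]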
Fig. 3. Coexpression of potassium channels with their own regulatory C-terminal domains. (A) MALDI-TOF mass spectrum of the coexpressed Escherichia coli potassium channel proteins. a.u., arbitrary units. (B) Model for channel-RCK domain arrangement in the Methanobacterium thermoautotrophicum calcium-gated potassium channel (adapted from Jiang et al. [19]). A single gene gives rise to two gene products, and both products are required in the assembly of a functional K+ channel.
cleavage fragment. In fact, careful MS analysis can provide information on this extra band which is important not only from a structural but also from a functional standpoint.

The Escherichia coli potassium channel showed such a lower band on SDS–PAGE [21]. Mass spectrometry identified the full-length channel, along with a lower mass species (Fig. 3A). This second species would naturally have been hypothesized to be a cleavage fragment of the protein. However, no N-terminal cleavage product matched the observed mass within acceptable measurement error, given that the accuracy of the method with careful internal calibration placed the expected error at around 3 Da. For a C-terminus-containing cleavage product, we found a match for a sequence starting at methionine #240, suggesting a cleavage after threonine #239, a rather unusual amino acid for proteolysis, even if accidental. This observation prompted us to consider other interpretations for the observed mass, in particular the presence of a secondary site for the initiation of translation. If a second initiation site on the transcript (or on a second transcript) is used by the translation machinery, then it is possible to observe the simultaneous expression, or co-expression, of two different protein species starting at two different methionines in the sequence. In the case of the E. coli potassium channel, the identity of the species starting at M240, apparently coeluting with the full-length channel, was confirmed by Edman sequencing. Taking this information into account, this C-terminal domain, corresponding to the RCK (regulator of channel conductance) domain of the channel, was crystallized independently from the full-length channel [21]. The Methanobacterium thermoautotrophicum calcium-gated channel exhibited similar behavior: the gene apparently possessed a secondary translation initiation site with co-expression of the regulatory domain.
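The reasoning used above, namely scanning for a C-terminus-containing species whose calculated mass matches the observed one within the expected error, can be sketched in a few lines. This is an illustrative helper, not the authors' software: the function names and the toy sequence are hypothetical, the observed value is assumed to be a neutral average mass (already corrected for the charge carrier), and modifications such as formylation or oxidation are ignored. Residue masses are standard average values.

```python
from typing import List, Tuple

# Standard average residue masses (Da); WATER is added once per chain.
AVG_RESIDUE_MASS = {
    "G": 57.0513, "A": 71.0779, "S": 87.0773, "P": 97.1152, "V": 99.1311,
    "T": 101.1039, "C": 103.1429, "L": 113.1576, "I": 113.1576,
    "N": 114.1026, "D": 115.0874, "Q": 128.1292, "K": 128.1723,
    "E": 129.1140, "M": 131.1961, "H": 137.1393, "F": 147.1739,
    "R": 156.1857, "Y": 163.1733, "W": 186.2099,
}
WATER = 18.0153


def average_mass(seq: str) -> float:
    """Average neutral mass (Da) of an unmodified polypeptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


def ppm_to_da(mass: float, ppm: float) -> float:
    """Convert a relative accuracy in ppm into an absolute error in Da."""
    return mass * ppm / 1e6


def c_terminal_matches(
    seq: str, observed_mass: float, tol_da: float = 3.0
) -> List[Tuple[int, str, float]]:
    """List C-terminus-containing fragments matching the observed mass.

    Returns (1-based start position, first residue, calculated mass).
    A hit starting on 'M' is a candidate internal initiation site rather
    than a proteolytic fragment, and deserves an orthogonal check.
    """
    hits = []
    for i in range(1, len(seq)):  # i = 0 would be the full-length chain
        mass = average_mass(seq[i:])
        if abs(mass - observed_mass) <= tol_da:
            hits.append((i + 1, seq[i], mass))
    return hits


if __name__ == "__main__":
    toy = "MSTKAGLLVRMAEQK"  # hypothetical sequence, second Met at position 11
    observed = average_mass("MAEQK") + 1.0  # simulated 1 Da measurement error
    for start, first, mass in c_terminal_matches(toy, observed):
        print(start, first, round(mass, 2))
```

The default 3 Da tolerance mirrors the expected error quoted above; `ppm_to_da` shows how it relates to the accuracy range in linear mode, since 150 ppm on a 20 kDa species corresponds to 3 Da. As in the text, a match on mass alone is only a hypothesis, which the authors confirmed by Edman sequencing.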
In an attempt to facilitate crystallization by producing the most homogeneous protein preparation possible, this second methionine was mutated out. However, the mutant protein proved to be unstable, prone to aggregation, and unsuitable for crystallization. Roderick MacKinnon and his team finally solved the structure of the channel co-expressed with its own regulatory C-terminal domain, elegantly providing the explanation for the behavior of the mutant (Fig. 3B) [22,23]. The RCK domain has to be present in stoichiometric amounts to dimerize with the corresponding domain in the full-length channel. In its absence, the full-length channel is left to dimerize with itself, a quaternary structure which is incompatible with proper insertion of all the transmembrane segments into the membrane (Fig. 3B) and ultimately leads to inactivity. The identification of a second initiator methionine was one of the key elements in the elucidation of the channel mechanism with respect to gating. The example above should thus serve to stress the importance of a proper inventory of all functionally relevant translation initiation sites.

4. Limited proteolysis for delineating folded protein domains (domain determination)

The use of limited proteolysis to delineate folded protein domains is well documented and has long been employed to assist in the design of protein constructs suitable for crystallization (for a review of domain determination by limited proteolysis combined with mass spectrometry, see [24]). First, it shows whether a protein is correctly folded when expressed according to the predicted open reading frame. If the initial sequence turns out to be problematic, domain determination is a great help in devising strategies for expression of a suitable construct. It should be noted that construct production by automated manipulation of the gene sequence, while complementary to domain determination, does not yield the same detail of information.
For example, choosing appropriate locations for the beginning of the sequence may require knowing naturally folded domains. A protein sequence
Table 2. Suggested initial enzyme:substrate ratios for setting up limited proteolysis experiments

Proteinase (sequencing grade)      Proteinase:protein ratio (w/w)
Endoproteinase LysC                1:2000
Endoproteinase ArgC                1:500
Endoproteinase AspN                1:1000
Endoproteinase GluC (V8)           1:200
Trypsin                            1:100
Chymotrypsin                       1:400
Subtilisin                         1:4000
Proteinase K                       1:2000
starting in the middle of a folded domain is more likely to give rise to a misfolded product. A loose C-terminal stretch of sequence will be more sensitive to adventitious proteolysis during expression and purification. On the other hand, one sometimes may decide to keep a floppy loop or extremity in the sequence because it is important for expression or protein function. Domain determination provides the structural biologist with the level of information necessary to make these decisions on a rational basis.

Limited proteolysis is conducted under approximately single-hit conditions so that, on average, only one cleavage occurs per protein chain. This ensures that the general structure of the protein is conserved while producing a limited number of "nicks" on the protein backbone. The approach is akin to a footprinting of solvent-accessible peptide bonds.

The choice of enzymes working in conditions where the membrane protein is in its native state is usually quite large. It is advisable to start experiments with proteinases of narrow specificity, i.e. with proteolytic activity against one or two amino acid types. The most common are endoproteinases LysC, ArgC, AspN and GluC, and trypsin. The information obtained with specific proteases can then be used to facilitate the interpretation of cleavage results for lower specificity proteases such as subtilisin or proteinase K. Table 2 shows enzyme:substrate ratios that have been optimized to be used as a starting point when designing limited proteolysis experiments. These ratios are based on numerous limited proteolysis experiments on integral membrane proteins, whereas the ratios suggested by manufacturers are given for the purpose of total (not limited) proteolysis, and are aimed at proteins in general, not membrane proteins. The different ratios also reflect the wide range of specific activities among proteinases. All proteases should be sequencing grade.
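Two pieces of arithmetic sit behind the guidance above. First, the Table 2 ratios translate directly into the mass of proteinase to add for a given amount of protein. Second, if cleavage events are assumed to be independent and rare, so that the number of cuts per chain follows a Poisson distribution (a modelling assumption made here for illustration, not a statement from the protocol), the fractions of uncut, singly cut and multiply cut chains follow from the mean number of cuts. The names below are hypothetical.

```python
import math
from typing import Dict

# Starting proteinase:protein ratios (w/w) from Table 2, stored as the
# denominator N of a 1:N ratio.
TABLE2_RATIO_DENOMINATOR: Dict[str, int] = {
    "LysC": 2000, "ArgC": 500, "AspN": 1000, "GluC": 200,
    "trypsin": 100, "chymotrypsin": 400, "subtilisin": 4000,
    "proteinase K": 2000,
}


def enzyme_mass_ug(protein_ug: float, proteinase: str) -> float:
    """Mass of proteinase (ug) for a given protein mass at the Table 2 ratio."""
    return protein_ug / TABLE2_RATIO_DENOMINATOR[proteinase]


def cut_distribution(mean_cuts: float) -> Dict[str, float]:
    """Fractions of chains with 0, 1 or more cuts under a Poisson model.

    Assumption (not from the source): cuts per chain are Poisson
    distributed with the given mean.
    """
    uncut = math.exp(-mean_cuts)
    single = mean_cuts * uncut
    return {"uncut": uncut, "single": single, "multiple": 1.0 - uncut - single}


if __name__ == "__main__":
    # 10 ug of membrane protein digested with LysC at 1:2000 (w/w).
    print(f"{enzyme_mass_ug(10.0, 'LysC') * 1000:.1f} ng LysC")
    # Expected populations along a time course, for increasing mean cuts.
    for mean in (0.25, 0.5, 1.0, 2.0):
        d = cut_distribution(mean)
        print(mean, {k: round(v, 3) for k, v in d.items()})
```

Under this simple model, an average of one cut per chain still leaves roughly a quarter of the chains cut more than once, which is one way to see why early time points, where the mean is well below one, are the most informative for ranking sites.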
The membrane protein substrate should be as homogeneous as possible. Heterogeneity in the starting material can lead to misinterpretation of the results, because cleavage products can arise either from the intended substrate or from a contaminant species. The kinetics of cleavage is monitored by MALDI-TOF MS, using the ultrathin layer method described above. Cleavage sites are then ranked according to how early they appear in the experiment, roughly dividing into “major” and “minor” sites, corresponding to “early” and “late” sites, respectively. A proteinase sensitivity map can then be drawn over the primary and/or secondary structure of the protein. By repeating proteolysis experiments with various proteinases, it is possible to get a fairly detailed picture of the folded domains. This approach was, for example, successfully applied to the voltage-gated potassium channel [25]. Limited proteolysis can also assist in the choice of the detergent for optimum protein stability.

Beyond the design of a stable, properly folded protein construct, once diffraction data have been gathered, domain elucidation can provide interesting clues when building the structure model. A number of such experiments had been performed on the E. coli chloride channel. Fig. 4A shows the cleavage sites identified for a subset of two of the proteinases employed for this task. Topology prediction graphs drawn at the time showed the putative transmembrane helical segments, consisting of 12 helices of roughly
G. Gabant, M. Cadene / Methods 46 (2008) 54–61
Fig. 4. Major cleavage sites in a subset of limited proteolysis experiments on the Escherichia coli ClC chloride channel. Green, endoproteinase ArgC cleavage. Orange, chymotrypsin. (A) Plot of major cleavage sites onto the algorithm-predicted helix arrangement. (B) Plot of major sites onto the helix arrangement as observed in the crystal structure. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
5. Time-dependent proteolysis to obtain a homogeneous protein segment

One of the main obstacles to obtaining highly diffracting protein crystals is micro-heterogeneity of sequence, usually due to adventitious proteolysis or a post-translational modification. Proteinases can be employed to tailor the protein to a well-folded set of domains. In this case, the optimum conditions, such as the choice of enzyme, the enzyme:substrate ratio, and the temperature and duration of cleavage, may differ from those found for domain determination by limited proteolysis (see Section 1.3). By definition, one now aims at a complete cleavage, limited to one or two areas of the protein, usually close to the termini. The kinetics of cleavage is again monitored using MALDI-TOF MS and the ultrathin layer method. In favorable cases, it is possible to find a set of conditions that will produce a homogeneous, well-defined protein species (for an example, see [26]).

6. Selenomethionine incorporation for MAD analysis

For X-ray structure determination, phase determination can be achieved by isomorphous replacement experiments using heavy atom derivatives such as methyl mercury. However, the
equal length, suggesting neatly bunched segments in rhodopsin-like fashion. When the observed cleavages were plotted onto the putative helix arrangement, some odd results were immediately evident, with unexpected cleavages in the middle of helices, or no cleavage within obviously exposed loops (Fig. 4A, emphasized in red). This apparent conflict provided clues that the topology prediction plot was probably misleading. Fig. 4B shows the proteinase target sites placed onto the actual helix arrangement as determined by X-ray diffraction [26]. The structure showed that helices differ in number and length from the prediction, and that some actually lie across the membrane in a near-horizontal direction [26]. The orientation of polarized helices is integral to the chloride channel’s mechanism of ion transport across the membrane.

Although secondary structure prediction algorithms for membrane proteins may have improved somewhat in the past 6 years thanks to the accumulation of structure data, they should still be used with circumspection and cross-checked using limited proteolysis.
[Figure 5: MALDI-TOF mass spectra, intensity (a.u.) versus m/z over approximately 19800 to 20200; traces labeled “Methionine” and “Seleno-Met.”]

Fig. 5. Selenomethionine replacement in Escherichia coli potassium channel RCK domain. The upper spectrum is offset for display purposes. Singly charged species are shown. a.u.: arbitrary units. $\Delta(m/z)$ is the difference between the observed m/z ratios for the native and selenomethionine-replaced protein species. It is used to calculate the difference in relative molecular mass between the selenomethionine- and methionine-containing protein species, taking into account the protein charge state z: $\Delta(m/z) \times z = \Delta M = M_{\mathrm{SelenoMet}} - M_{\mathrm{Met}}$. The molecular mass difference $\Delta M$ is divided by 46.9 to calculate the number of replacements. In this example, $\Delta(m/z) = 188.9$ and z = 1, hence $\Delta M = 188.9$ Da. Thus the number of replacements is 188.9/46.9 = 4.03. This peak corresponds to a protein pool where all four methionines in the protein domain have been successfully replaced.
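The arithmetic in the Fig. 5 legend, together with the binomial estimate of partially replaced species used in this section, can be sketched in a few lines of Python. This is a minimal illustration, not part of the published protocol: the function names are hypothetical, the 46.9 Da shift per site is the value given in the text, and every methionine site is assumed to have the same, independent occupancy p.

```python
from math import comb

# Mass shift per Met -> SeMet replacement (Da), as given in the text.
SE_MINUS_S_DA = 46.9


def replacements_from_shift(delta_mz: float, z: int = 1) -> float:
    """Number of replaced methionines from the observed m/z difference.

    delta_mz * z gives the mass difference in Da, which is then divided
    by the per-site shift.
    """
    return delta_mz * z / SE_MINUS_S_DA


def fraction_with_k_replaced(n: int, k: int, p: float) -> float:
    """Binomial probability that exactly k of n methionines are replaced.

    ASSUMPTION: each site is replaced independently with occupancy p.
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    print(round(replacements_from_shift(188.9, z=1), 2))   # 4.03
    print(round(fraction_with_k_replaced(4, 3, 0.94), 3))  # 0.199
    print(round(fraction_with_k_replaced(4, 4, 0.94), 3))  # 0.781
```

With the values from the example (four methionines, p = 0.94), the fully replaced species accounts for about 78% of the pool and the species with three replacements for about 20%, consistent with the major and minor peaks described for Fig. 5.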
resulting derivatized membrane protein becomes even more hydrophobic, rendering analysis of the protein in solution or of the dissolved crystal quite difficult [8]. The MALDI-TOF MS method described above can sometimes be successful.

For MAD (multi-wavelength anomalous dispersion) phase determination experiments, methionines are replaced by selenomethionines. Replacement of methionine by selenomethionine yields a mass shift of 46.9 Da, corresponding to the difference in average atomic mass between sulfur and selenium. MALDI-TOF MS can be used to check that metabolic replacement is sufficiently complete. As an example, Fig. 5 shows the mass shift corresponding to the number of replaced methionines for the E. coli potassium channel RCK domain. The major peak shows that all four methionines in the protein domain have been successfully replaced. Note that a minor fraction with only three replacements can be observed, meaning that at every site a small percentage of methionine is present. This is because, although replacement occurs
homogeneously across the sequence, the selenomethionine used to produce the seleno-protein always contains a small percentage of methionine and metabolic replacement may not be entirely complete. If p is the proportion of selenomethionine at every site (also known as the occupancy), the proportion of partially replaced protein can be predicted using the binomial probability equation: $P(k) = \frac{n!}{k!\,(n-k)!}\,p^{k}(1-p)^{n-k}$, where n is the total number of methionines to be replaced and k the number of selenomethionines. In the example above, for p = 0.94, the proportion of the protein pool with 3 out of 4 replacements would be $P(3) = \frac{4!}{3!\,(4-3)!}\,p^{3}(1-p)^{4-3} \approx 0.2$ (20%). Typically, an occupancy p over 90% is sufficient for MAD analysis, in which case the major species in the spectrum will be the fully replaced protein. We similarly analyzed the full-length E. coli chloride channel protein using the above MALDI-TOF MS approach and found all 17 methionines to be replaced [26].

7. Conclusion

Mass spectrometry is a powerful and versatile tool offering deep insights into the state of the integral membrane protein the structuralist intends to crystallize. With appropriate sample preparation methods, it provides information that can sometimes prove critical at various stages of the structure determination process, from protein expression to model building.

Acknowledgements

MC wishes to thank all her collaborators, past and present, from the Roderick MacKinnon laboratory at the Rockefeller University, particularly Youxing Jiang, Raymund Dutzler, Alice MacKinnon, Alexander Pico, Vanessa Ruta, Ernest Campbell, Jacqueline Gulbis, Motohiko Nishida, Sebastien Poget and João Morais Cabral. Many thanks to Steven Cohen and Julio Cesar Padovan for useful discussions. MC is also deeply grateful to Brian T. Chait and Roderick MacKinnon for years of wonderful science.
References

[1] D.A. Doyle, J. Morais Cabral, R.A. Pfuetzner, A. Kuo, J.M. Gulbis, S.L. Cohen, B.T. Chait, R. MacKinnon, Science 280 (1998) 69–77.
[2] K.L. Schey, D.I. Papac, D.R. Knapp, R.K. Crouch, Biophys. J. 63 (1992) 1240–1243.
[3] P.A. Schindler, A. Van Dorsselaer, A.M. Falick, Anal. Biochem. 213 (1993) 256–263.
[4] V. Schnaible, J. Michels, K. Zeth, J. Freigang, W. Welte, S. Buhler, M.O. Glocker, M. Przybylski, Int. J. Mass Spectrom. Ion Process. 169–170 (1997) 165–177.
[5] J.P. Whitelegge, C.B. Gundersen, K.F. Faull, Protein Sci. 7 (1998) 1423–1430.
[6] M. Cadene, B.T. Chait, Anal. Chem. 72 (2000) 5655–5658.
[7] Z. Ablonczy, R.K. Crouch, D.R. Knapp, J. Chromatogr. B. Analyt. Technol. Biomed. Life Sci. 825 (2005) 169–175.
[8] S.L. Cohen, J.C. Padovan, B.T. Chait, Anal. Chem. 72 (2000) 574–579.
[9] R.C. Beavis, B.T. Chait, Methods Enzymol. 270 (1996) 519–551.
[10] F. Xiang, R.C. Beavis, Rapid Commun. Mass Spectrom. 8 (1994) 199–204.
[11] M. Safferling, H. Griffith, J. Jin, J. Sharp, M. De Jesus, C. Ng, T.A. Krulwich, D.N. Wang, Biochemistry 42 (2003) 13969–13976.
[12] Y. Huang, M.J. Lemieux, J. Song, M. Auer, D.N. Wang, Science 301 (2003) 616–620.
[13] D. Fenyo, Q.J. Wang, J.A. DeGrasse, J.C. Padovan, M. Cadene, B.T. Chait, J. Visualized Experiments Cellular Biology, Issue 3 (2007). http://www.jove.com/index/details.stp?ID=192.
[14] C. Buré, S. Goffinont, A.F. Delmas, M. Cadene, F. Culard, J. Mol. Biol. 376 (2008) 120–130.
[15] M. Trester-Zedlitz, A. Burlingame, B. Kobilka, M. von Zastrow, Biochemistry 44 (2005) 6133–6143.
[16] C. Seibert, M. Cadene, A. Sanfiz, B.T. Chait, T.P. Sakmar, P.N.A.S. 99 (2002) 11031–11036.
[17] R.W. Hepler, R. Kelly, T.B. McNeely, H.X. Fan, M.C. Losada, H.A. George, A. Woods, L.D. Cope, A. Bansal, J.C. Cook, G. Zang, S.L. Cohen, X.R. Wei, P.M. Keller, E. Leffel, J.G. Joyce, L. Pitt, L.D. Schultz, K.U. Jansen, M. Kurtz, Vaccine 24 (2006) 1501–1514.
[18] W.M. Abida, A. Nikolaev, W. Zhao, W. Zhang, W. Gu, J. Biol. Chem. 282 (2007) 1797–1804.
[19] V. Bourgeaux, M. Cadene, F. Piller, V. Piller, Chem. Biol. Chem. 8 (2007) 37–40.
[20] E.J. Chang, V. Archambault, D.T. McLachlin, A.N. Krutchinsky, B.T. Chait, Anal. Chem. 76 (2004) 4472–4483.
[21] Y. Jiang, A. Pico, M. Cadene, B.T. Chait, R. MacKinnon, Neuron 29 (2001) 593–601.
[22] Y. Jiang, A. Lee, J. Chen, M. Cadene, B.T. Chait, R. MacKinnon, Nature 417 (2002) 515–522.
[23] Y. Jiang, A. Lee, J. Chen, M. Cadene, B.T. Chait, R. MacKinnon, Nature 417 (2002) 523–526.
[24] S.L. Cohen, B.T. Chait, Annu. Rev. Biophys. Biomol. Struct. 30 (2001) 67–85.
[25] Y. Jiang, A. Lee, J. Chen, M. Cadene, B.T. Chait, R. MacKinnon, Nature 423 (2003) 33–41.
[26] R. Dutzler, E.B. Campbell, M. Cadene, B.T. Chait, R. MacKinnon, Nature 415 (2002) 287–294.