CHAPTER SIX
Modern Mass Spectrometry-Based Structural Proteomics Evgeniy V. Petrotchenko*, Christoph H. Borchers*,†,1
*University of Victoria—Genome British Columbia Proteomics Centre, Victoria, British Columbia, Canada † Department of Biochemistry and Microbiology, University of Victoria, Petch Building Room 207, Victoria, British Columbia, Canada 1 Corresponding author: e-mail address:
[email protected]
Contents
1. Introduction
2. The Concept of Structural Proteomics
3. Limited Proteolysis
4. Surface Modification
5. Hydrogen–Deuterium Exchange
6. Cross-linking
7. Additional Mass Spectrometric Techniques for the Protein Structure Analysis
8. Combination of Multiple Structural Proteomics Techniques
9. Use of Experimental Structural Proteomics Constraints in Protein Structure Modeling
10. Future Directions
11. Conclusions
Acknowledgment
References
Abstract
Recent developments in the mass spectrometry of proteins and peptides have resulted in significant progress in structural proteomics techniques for studying protein structure. A variety of protein structural questions, ranging from defining protein interaction networks to the study of conformational changes and the structure of single proteins, can be addressed using multiple mass spectrometry-based structural proteomics approaches. Each technique provides specific structural information which can be used as experimental structural constraints in protein structure modeling. Here, we describe recent developments in limited proteolysis, surface modification, hydrogen–deuterium exchange, ion mobility, and cross-linking, all combined with modern mass spectrometric techniques, for studying protein structure.
Advances in Protein Chemistry and Structural Biology, Volume 95 ISSN 1876-1623 http://dx.doi.org/10.1016/B978-0-12-800453-1.00006-3
© 2014 Elsevier Inc. All rights reserved.
1. INTRODUCTION
Knowledge of protein structures is crucial for understanding how biological systems function in healthy and disease states. The recent revolution in mass spectrometry-based proteomics has revived interest in the traditional protein chemistry methods for studying protein structure. The combination of methods such as limited proteolysis, chemical surface modification, hydrogen–deuterium exchange (HDX), cross-linking, and affinity labeling with mass spectrometry has led to the new field of structural proteomics. Each of these methods provides specific and unique information on the protein system studied. For example, limited proteolysis can indicate portions of folded proteins that are accessible to a large probe (a proteolytic enzyme) and must therefore be exposed to the solvent. Chemical surface modification provides similar information on the accessibility of a particular amino acid residue to a smaller probe (a modification reagent). HDX of the amide protons in the peptide bonds can provide data on hydrogen bonding status and accessibility and, therefore, on the presence of secondary structure elements in the protein sequence. Chemical cross-linkers of different lengths form a “molecular ruler” which can provide distances between the cross-linked residues (Fasold, Klappenberger, Meyer, & Remold, 1971; Green, Reisler, & Houk, 2001; Peters & Richards, 1977). Taken together, these methods provide a set of structural constraints on the folded protein or protein complex being studied. In this review, we describe the current status of these structural proteomics methodologies and their application to the elucidation of the structures of proteins and protein complexes, using the example of our recent study of prion protein conversion and aggregation (Serpa, Patterson, et al., 2013).
2. THE CONCEPT OF STRUCTURAL PROTEOMICS
The concept of structural proteomics is to obtain multiple types of experimental data to characterize the structure of the protein system studied (Konermann, Vahidi, & Sowole, 2014; Serpa et al., 2012). As for proteomics in general, structural proteomics implies the use of mass spectrometry as the major instrumental tool. The recent revolutionary developments in the mass spectrometric analysis of peptides and proteins, which made proteomics possible, have also advanced structural proteomics. This new level of technology has opened the way for the combination of traditional protein chemistry methods with novel mass spectrometric methods, allowing the characterization of protein structure with amino acid residue resolution.
Structural proteomics can be applied to the study of protein systems of varying levels of complexity. Certain types of protein structural information can be obtained for each level of organization of the sample, that is, single proteins, binary protein complexes, multisubunit protein assemblies, proteome-wide protein interaction networks, organelles, cells, and tissues. For a single protein, structural proteomics can supply information at the amino acid residue level which is often impossible to obtain by other methods, including distance information between functional groups, the degree of a specific amino acid residue’s exposure to the solvent, and its involvement in hydrogen bonding and secondary structure elements. These data can be used as constraints in the molecular modeling process for an unknown protein structure, or can be used to assess the dynamic and conformational changes of the protein. In the case of known interacting proteins, information pertaining to the spatial proximity of interacting groups and to the changes in the exposure of protein surfaces upon complex formation can be useful for elucidating details of the protein interaction interfaces. This information can also be used for establishing protein complex topologies in multisubunit protein assemblies. For proteome-scale applications, the identities and structural details of the interacting proteins and protein domains within protein complexes can be determined. In this review, we will primarily focus on our studies of the structures of single proteins and known protein complexes.
3. LIMITED PROTEOLYSIS
The limited proteolysis method is based on short controlled exposures of the protein to a proteolytic enzyme. The first cleavage of the protein occurs while the tertiary and quaternary structure of the protein complex should still be preserved, so the initial cleavage sites should be restricted to the outermost regions of the protein subunit surfaces, that is, those that are accessible to the active site of the proteolytic enzyme. Because most enzymes are globular proteins with molecular weights of at least 15–20 kDa, the location of the cleavage sites will reflect their accessibility to a nearly spherical probe whose diameter corresponds to the size of the enzyme used.
Several mass spectrometric approaches can be used for the determination of the cleavage sites. The most common approach is to first characterize the limited proteolysis reaction by SDS-PAGE. Serial time points for short (e.g., 1–5 min) exposures of the protein to the diluted proteolytic enzyme (e.g., 1:100 enzyme:substrate ratio) are used, the reactions are quenched, and the products are separated by SDS-PAGE. The progress of proteolysis can be followed through the time-dependent appearance of the proteolytic fragments resulting from the enzymatic cleavage. The fragments that are first to appear can then be identified by in-gel digestion followed by peptide mapping, which will indicate the sites of cleavage (Fig. 6.1A). Alternatively, cleavage sites can be deduced by measuring the exact mass of the entire fragment by Orbitrap/FTICR mass spectrometry and top-down MS/MS analysis.
Using the former approach, we have characterized conformational changes occurring in the course of the conversion of prion proteins from the native (PrPC) to the aggregated (PrPβ) state. We demonstrated increased protection of the K110 cleavage site in PrPβ compared to PrPC, suggesting intra- and/or interprotein interactions, and increased cleavage at sites in the region of residues 149–156 in PrPβ, but not in PrPC, which indicates increased exposure of the hydrophobic residues in this region (Fig. 6.1B). These changes indicate unfolding or rearrangement of the C-terminal portion of the amphipathic helix 1 (H1) during the PrPC to PrPβ conversion process (Serpa, Patterson, et al., 2013).
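The idea of deducing a cleavage site from the accurate mass of an intact fragment can be illustrated with a minimal sketch. It assumes an unmodified fragment, a hypothetical toy sequence (not the prion protein), and an arbitrary 10 ppm tolerance; the function names are illustrative and do not correspond to any software used in the studies described here.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide chain


def fragment_mass(seq):
    """Monoisotopic neutral mass of an unmodified peptide or protein fragment."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def candidate_fragments(sequence, observed_mass, tol_ppm=10.0):
    """Return 1-based (first, last) residue spans whose mass matches observed_mass.

    The ends of a matching span point to the candidate cleavage sites.
    """
    hits = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            mass = fragment_mass(sequence[start:end])
            if abs(mass - observed_mass) / mass * 1e6 <= tol_ppm:
                hits.append((start + 1, end))
    return hits


toy_sequence = "GASPVTKLNDR"                 # hypothetical sequence
observed = fragment_mass(toy_sequence[:7])   # simulate a fragment ending at K7
print(candidate_fragments(toy_sequence, observed))
```

Several spans can share the same mass within the tolerance, which is one reason the measured mass is normally combined with MS/MS or peptide-mapping evidence.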
Figure 6.1 Limited proteolysis. (A) Principle of limited proteolysis cleavage-site determination by peptide mapping of the cleavage products. Following a short controlled exposure to the proteolytic enzyme, cleavage products are separated by SDS-PAGE and are in-gel digested. Peptide mapping of the cleavage products indicates the cleavage site. (B) An example of a differential limited proteolysis study of the native (PrPC) and pathological (PrPβ) states of the prion protein, using trypsin and pepsin over a 0–30 min time course. Different patterns of limited proteolysis products were observed for the two states of the protein. Peptide mapping analysis of the cleavage products (indicated by the arrows) revealed that K110 and the aa149–156 region are differentially accessible in the two forms of the prion protein. Reprinted from Serpa, Patterson, et al. (2013), with permission.
4. SURFACE MODIFICATION
Surface modification provides information similar to that obtained from the limited proteolysis approach (i.e., protein surface accessibility), but the determined accessibilities are to a smaller probe, in this case, a modification reagent. The basis of this method is a chemical reaction of the protein with a water-soluble modification reagent. Chemical modification of the protein surface thus allows the determination of which regions of the proteins are exposed to the solvent. Although the microenvironment can have a significant influence on the reactivity of amino acid residues, it is mainly those functional groups that are solvent exposed (i.e., located on the surface of the protein molecules) which will be modified with amino acid-specific reagents. Identification of these modification sites thus indicates which amino acid residues are on the protein surfaces and are in contact with the solvent. The regions of the protein that are internal or are involved in the formation of interprotein contacts are shielded from the modification reagent and consequently remain largely unmodified. Thus, analysis of the distribution of the chemical modification sites for a multisubunit protein complex indicates which regions of protein surfaces are solvent accessible and which are “shielded” or “protected” because they are either buried or are involved in protein interaction interfaces of the protein complexes.
This approach can be particularly informative for comparing two states of the protein, and can indicate changes in protection resulting from conformational changes and/or complex formation. In this type of differential experiment, surface modification is performed in parallel for two states of the protein, and differences in the reactivity of particular amino acid residues, which can be quantified by mass spectrometry, will reflect their involvement in the protein’s structural changes. Differential chemical surface modification can benefit greatly from the use of isotopically coded reagents. The light and heavy isotopic forms of such a reagent are chemically identical and behave identically during mass spectrometric analysis, but they produce modification products of different masses because of the mass differences in the stable isotopes employed. If the light form is used for one conformational state and the heavy form is used for the other, combining the two samples before mass spectrometric analysis provides a convenient method for relative quantitation of the modification reaction yields for both samples in the same mass spectrum and under the same instrumental conditions (Fig. 6.2A).
We have also applied this approach to the characterization of the PrPC to PrPβ conversion. The isotopically coded water-soluble amine-reactive modification reagent PCASS-H4/-D4 (Fig. 6.2B), developed by our group, was used to quantitatively determine differences in specific amino acid reactivities between PrPC and PrPβ. Each form was modified with either the light or the heavy isotopic form of the reagent so that differences in residue reactivities between the two prion isoforms could be determined from the ratios of the signal intensities of the light (H4) and heavy (D4) forms of the modified peptides. Several residues were found to be preferentially modified in the PrPC form: K110 (located on the N-terminal portion of the protein); S132 and S135 (located on the β1–H1 loop, residues 128–142); and K220, Y225, Y226, and S231 on the C-terminal portion of H3 (residues 200–232). Changes in the reactivity of these residues as a result of PrPC to PrPβ conversion may indicate involvement of these regions in intra- and/or interprotein interactions within β-oligomers. The increased modification of residues on the H1–β2/H2–H3 interface (Y128, Y149, Y150, Y157, Y163, Y169) in PrPβ as compared to PrPC also suggests a conformational change or rearrangement of this region, which is in good agreement with the limited proteolysis data (Serpa, Patterson, et al., 2013).
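The light/heavy quantitation described above can be sketched in a few lines. This is only an illustration under stated assumptions: a centroided peak list of (m/z, intensity) tuples, a known charge state, one reagent moiety per peptide, and a four-deuterium label as in an H4/D4 reagent pair. The function name, tolerance, and example peaks are hypothetical.

```python
import math

D_MINUS_H = 1.0062767  # mass difference between 2H and 1H in Da


def find_isotope_pairs(peaks, n_deuterium=4, charge=1, tol=0.01):
    """Find light/heavy peak pairs and report log2(light/heavy) intensity ratios.

    peaks: iterable of (mz, intensity) tuples from one mass spectrum.
    Returns a list of (light_mz, heavy_mz, log2_ratio).
    """
    delta = n_deuterium * D_MINUS_H / charge  # expected m/z spacing of a pair
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(ordered):
        for mz_heavy, int_heavy in ordered[i + 1:]:
            spacing_ok = abs((mz_heavy - mz_light) - delta) <= tol
            if spacing_ok and int_light > 0 and int_heavy > 0:
                pairs.append((mz_light, mz_heavy, math.log2(int_light / int_heavy)))
    return pairs


# Hypothetical doubly charged peptide pair plus one unrelated peak.
example_peaks = [(800.4000, 3.0e5), (802.4126, 1.0e5), (950.5000, 2.0e5)]
for light, heavy, ratio in find_isotope_pairs(example_peaks, charge=2):
    print(f"{light:.4f} / {heavy:.4f}  log2(L/H) = {ratio:.2f}")
```

How a ratio is interpreted depends on which protein state received the light reagent; running the experiment a second time with the labels swapped is a common way to guard against label-specific bias.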
Recently, we have extended this approach to the use of isotopically coded hydrogen peroxide (H₂¹⁶O₂ and H₂¹⁸O₂) as the modification reagent. This allowed us to obtain an additional, complementary set of methionine and tryptophan residues that are differentially modified between PrPC and PrPβ (Serpa, Petrotchenko, Wishart, & Borchers, 2013), which was found to be in good agreement with the PCASS modification results.
Figure 6.2 Chemical surface modification. (A) Principle of the determination of differential amino acid residue reactivity using isotopically coded surface modification reagents. Proteins in two different states are modified with light and heavy isotopic forms of the reagent, respectively. Following quenching of the reaction, protein samples are combined, digested, and analyzed by mass spectrometry. Differentially modified peptides manifest in the MS spectra as pairs of signals separated by the mass difference between the light and heavy isotopic forms of the reagent used. The ratio of the intensities of the light and heavy forms of the peptides reflects the relative reactivities of the modified sites in the two states of the protein. (B) An example of a differential surface modification study of the prion protein in two conformational states, PrPC and PrPβ. The isotopically coded modification reagent PCASS-H4/D4 was employed. Several residues showed differential reactivities between the two forms of the protein. Reprinted from Serpa, Patterson, et al. (2013), with permission.
5. HYDROGEN–DEUTERIUM EXCHANGE
HDX is based on the principle that protein backbone hydrogens can be exchanged with deuterium upon exposure of a protein to a D2O-based buffer. The exchange rates for individual peptide bond amide hydrogen atoms are dependent on the protein’s structure: tightly hydrogen-bonded segments undergo very slow exchange, while disordered regions exchange much more rapidly. The exchange behavior of the amide hydrogen of a particular amino acid residue may therefore indicate its involvement in secondary structure elements and/or its exposure to the solvent. Short controlled immersion of the protein or protein complex into a D2O solution will lead to the replacement of the exchangeable hydrogens on the protein surface with deuterium atoms from the solvent. Because deuterium is about 1 Da heavier than hydrogen (roughly twice its mass), the exchange can be readily detected and quantified by mass spectrometry.
There are two general strategies to assess the location and extent of the exchange: bottom-up and top-down analysis. In the bottom-up approach, the protein is quickly digested, usually with pepsin, under conditions of low pH and low temperature at which the peptide bond amide hydrogen exchange rate is minimal. The peptides produced are then analyzed by mass spectrometry to determine the relative amount of exchange that has occurred. In the top-down approach, the intact protein is exposed to time-controlled incubation in D2O buffer, and is then infused into the mass spectrometer and analyzed by MS and MS/MS. We have developed this top-down method in combination with electron-capture dissociation (ECD)-FTICR MS (Pan, Han, Borchers, & Konermann, 2008, 2009). ECD is a rapid fragmentation technique that fragments the peptide backbone while avoiding hydrogen scrambling (i.e., migration of the amide hydrogens along the peptide chain), and produces an extensive series of c- and z-ions covering the protein sequence, with most fragments differing by a single residue. By comparing the masses of the consecutive fragments in both the c- and z-series, the exchange rate can be determined at nearly single-residue resolution (Fig. 6.3A).
We have applied this approach to the analysis of both the secondary structures of intact proteins and conformational changes, as in the case of the prion protein conversion from PrPC to PrPβ. The protein solution is continuously mixed in a capillary, first with D2O and then with an acidic quenching solution, and the solution is then directly infused into the mass spectrometer. Using this approach, we determined that approximately 38 amides were protected from exchange in PrPC, while only 23 were protected in the misfolded PrPβ form. In other words, 15 amides became unprotected when PrP changed from the monomer to the oligomer. The region of deprotection was localized to residues 148–164, which is the stretch of the protein sequence encompassing H1–β2 (Fig. 6.3B). HDX deprotection in this region indicates the loss of secondary structure (melting of H1, disassembly of the β-sheet involving the β2 strand) and/or disruption of the H1–β2/H2–H3 interface (Serpa, Patterson, et al., 2013).
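The comparison of consecutive fragment masses can be sketched as follows. This is a simplified illustration, not the published data-processing procedure: it assumes that neutral masses of matching fragments from an undeuterated and a deuterated sample are already available, ignores back-exchange and the residual H2O content of the labeling buffer, and uses an arbitrary 0.5 D threshold to call an amide "protected". All names and the toy numbers are hypothetical.

```python
D_MINUS_H = 1.0062767  # mass gained per H-to-D replacement, in Da


def per_residue_uptake(mass_h, mass_d):
    """Deuterium gained between consecutive fragments of one ion series.

    mass_h, mass_d: dicts mapping fragment index n to the neutral mass of the
    undeuterated and deuterated fragment. Returns {n: uptake between n-1 and n}
    for every n where both neighbouring fragments were observed in both samples.
    """
    total = {n: (mass_d[n] - mass_h[n]) / D_MINUS_H for n in mass_h if n in mass_d}
    return {n: total[n] - total[n - 1] for n in sorted(total) if n - 1 in total}


def count_protected(uptake, threshold=0.5):
    """Count positions whose deuterium uptake stays below the threshold."""
    return sum(1 for value in uptake.values() if value < threshold)


# Toy c-ion ladder: one deuterium gained between c3 and c4, none between c4 and c5.
undeuterated = {3: 300.0, 4: 400.0, 5: 500.0}
deuterated = {3: 300.0 + 1 * D_MINUS_H, 4: 400.0 + 2 * D_MINUS_H, 5: 500.0 + 2 * D_MINUS_H}
uptake = per_residue_uptake(undeuterated, deuterated)
print(uptake, count_protected(uptake))
```

Counting protected positions in two states and subtracting the totals mirrors the comparison made in the text, where 38 protected amides in PrPC and 23 in PrPβ give a difference of 15.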
Figure 6.3 Hydrogen–deuterium exchange. (A) Principle of determining the deuteration status of amino acid residues by top-down ECD–FTICR MS. Facile ECD fragmentation produces scrambling-free c- and z-series of fragments at nearly every peptide bond in the protein. Comparing mass shifts between consecutive fragments in the series allows us to estimate the degree of H/D exchange for every peptide bond amide. Little or no exchange for an amino acid residue would indicate involvement of its amide proton in hydrogen bonding. (B) Example of a differential hydrogen–deuterium exchange study of the two states of the prion protein. The hydrogen–deuterium exchange patterns observed for the PrPC and PrPβ forms indicate that there is no significant difference in H-to-D exchange within the N- and C-terminal regions; however, there is a significant difference in exchange for fragments from the 148–164 region. Reprinted from Serpa, Patterson, et al. (2013), with permission.
6. CROSS-LINKING
The idea behind the use of cross-linking to determine a protein’s structure is straightforward: new covalent bonds are introduced between pairs of functional groups in the protein, the cross-linked sites are identified, and the distances between these sites are deduced from the length of the cross-linking bridges formed (Petrotchenko & Borchers, 2010a). These distances, in turn, can be used as constraints in the protein structure model-building process, and/or as characteristic features of the protein’s conformational changes.
The workflow in a typical “bottom-up” mass spectrometry-based cross-linking experiment involves cross-linking the protein(s) of interest, optional separation or purification of the cross-linked protein products, digestion of the cross-linked proteins into cross-linked and non-cross-linked peptides, and optional purification or enrichment of the cross-linked peptides (cross-links) (Fig. 6.4A). Finally, mass spectrometric analysis of the interpeptide cross-links (two peptides connected by the cross-linker bridge) leads to the identification of the component peptides and the cross-linking sites. Alternatively, the isolation of a small cross-linked protein can be done inside the mass spectrometer, with the cross-linking sites being localized by top-down FTICR MS (Kruppa, Schoeniger, & Young, 2003; Novak & Giannakopulos, 2007; Novak, Young, Schoeniger, & Kruppa, 2003).
Figure 6.4 Cross-linking. (A) Principle behind the determination of the cross-linked sites. Cross-linked proteins are digested, and cross-linked peptides are optionally enriched and analyzed by mass spectrometry. Several MS-oriented features of the cross-linking reagents facilitate detection and identification of the cross-links. Use of the isotopically coded affinity-enrichable CID-cleavable reagent CBDPS-H8/D8 allows specific and sensitive detection and unambiguous identification of the cross-links. (B) Example of the differential cross-linking study of the native and aggregated states of prion protein. Several differential CBDPS cross-links are not compatible with the native structure of the protein and point to the nature of the conformational change which leads to the aggregation. Reprinted from Serpa, Patterson, et al. (2013), with permission.
Despite the apparently straightforward nature of the chemical cross-linking approach, there are several significant challenges, such as the low relative and absolute abundance of the cross-links, the combinatorial number of peptide pairs that could constitute each cross-link, and the generally higher molecular weight of these interpeptide cross-links. Until recently, these issues have prevented the routine and widespread use of this technique. Fortunately, many of these challenges have been addressed by recent developments in mass spectrometric instrumentation, by the development of new cross-linking reagents (particularly isotopically coded cross-linking reagents), and, not insignificantly, by the development of specialized software for the processing of cross-linking data.
Numerous cross-linking reagents have recently been designed to incorporate special features which facilitate downstream processing and mass spectrometric analysis. These features include affinity tags and charge groups to facilitate selective enrichment of the cross-links, and the incorporation of isotopic coding, mass defect groups, MS/MS reporter groups, and cleavage sites to facilitate mass spectrometric detection and identification (Paramelle, Miralles, Subra, & Martinez, 2013; Petrotchenko & Borchers, 2010a). In order to obtain shorter (and therefore tighter) distance constraints, reagents with broader reactivity, such as homo- and heterobifunctional photoreactive and zero-length cross-linkers, are currently being developed. The complexity of the resulting mixture of cross-linking products is expected to be even higher for nonselective cross-linking reagents, so this approach will require more sophisticated data processing as well.
Trypsin cleaves at lysines and arginines, while lysine is also the target of amine-reactive cross-linking reagents. Because trypsin does not cleave at modified lysine residues, this combination often results in large cross-linked peptides. To circumvent this problem, double digestion (i.e., the use of an additional enzyme with a different specificity, such as GluC or AspN) has been proposed (Yan et al., 2009). We have recently reported on the successful use of the nonspecific enzyme proteinase K for generating “families” of interpeptide cross-links of an optimal size for mass spectrometric analysis (Petrotchenko et al., 2012).
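The effect of lysine modification on tryptic peptide size can be seen in a small in-silico digest. The sketch below uses a simplified cleavage rule (cleave C-terminal to K or R, but not before P, and not at modified positions); the sequence is a hypothetical toy example, and real digests also contain missed cleavages that are not modeled here.

```python
def tryptic_digest(sequence, blocked=()):
    """Simplified in-silico trypsin digest.

    Cleaves C-terminal to K or R unless the next residue is P or the
    1-based position is listed in `blocked` (e.g., a cross-linked lysine).
    """
    blocked = set(blocked)
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        position = i + 1
        next_is_proline = position < len(sequence) and sequence[position] == "P"
        if residue in "KR" and position not in blocked and not next_is_proline:
            peptides.append(sequence[start:position])
            start = position
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


toy = "AAKGGRSSKTT"                      # hypothetical sequence
print(tryptic_digest(toy))               # all K/R sites available
print(tryptic_digest(toy, blocked={3}))  # K3 modified: its cleavage site is lost
```

Each modified lysine merges two neighbouring peptides, and a cross-link joins two such merged peptides, which is why the resulting species can become too large for convenient MS/MS analysis.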
Enrichment of cross-linked peptides, which separates them from the overwhelming background of non-cross-linked peptides, also facilitates the mass spectrometric detection and assignment of cross-links. Enrichment techniques include gel-filtration chromatography (because a typical interpeptide cross-link is larger than a linear peptide; Leitner et al., 2012), strong cation-exchange chromatography for the tryptic interpeptide cross-links (because tryptic interpeptide cross-links carry twice as many positive charges as linear non-cross-linked tryptic peptides; Chen et al., 2010), affinity purification using tags which have been incorporated into the structure of a cross-linking reagent (usually a biotin group; Fujii, Jacobsen, Wood, Schoeniger, & Guy, 2004), functional groups for covalent capture (Buncherd et al., 2012; Chowdhury et al., 2009; Sohn et al., 2012; Yan et al., 2009), antigenic groups (Petrotchenko, Doant, & Borchers, 2006), and specific non-covalent interaction groups (Wang & Hakansson, 2008). These recently developed techniques have all been crucial for the successful detection of multiple cross-links.
Several additional techniques have been developed to improve mass spectrometric detection and identification of interpeptide cross-links. These include enzyme-mediated introduction of 18O isotopes during digestion (Back et al., 2002), N-terminal modification of the cross-linked peptides with isotopically coded reagents (Chen, Chen, & Anderson, 1999; Petrotchenko, Serpa, & Borchers, 2010), and the use of metabolically labeled proteins (Taverner, Hall, O’Hair, & Simpson, 2002). All of these techniques can produce “signatures” in the mass spectra that are specific to interpeptide cross-links.
Last but certainly not least, the introduction of high-mass-accuracy, high-performance, high-sensitivity instruments, such as FTICR-based mass spectrometers with multiple new and efficient fragmentation methods, has had a major impact on the progress of the cross-linking approach. Due to the previously mentioned combinatorial nature of the interpeptide cross-links, the accuracy of the mass measurements and facile MS/MS fragmentation are crucial factors for obtaining the correct assignments of the cross-links. The introduction of instruments such as the ThermoFisher Orbitrap, which combine high mass accuracy and sensitivity and, importantly, can be easily interfaced to HPLC, has been a significant breakthrough in detecting and assigning cross-links contained within the complex mixture of cross-linking reaction products.
The large amount of mass spectrometric data produced in a typical cross-linking experiment requires specialized software tools for data analysis.
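A toy version of the precursor-mass matching step shows why mass accuracy matters for a combinatorial search. The sketch below enumerates all peptide pairs whose summed mass plus a linker bridge mass matches an observed mass within a ppm tolerance. The peptide masses, the linker mass of 100.0 Da, and the function name are placeholders chosen for illustration; they are not the values of any real reagent or program.

```python
from itertools import combinations_with_replacement


def crosslink_candidates(peptide_masses, linker_mass, observed_mass, tol_ppm):
    """Return peptide pairs (A, B, theoretical mass) matching observed_mass.

    peptide_masses: dict mapping a peptide label to its neutral mass in Da.
    linker_mass: mass added by the cross-linker bridge (placeholder value here).
    """
    hits = []
    labelled = sorted(peptide_masses.items())
    for (name_a, mass_a), (name_b, mass_b) in combinations_with_replacement(labelled, 2):
        theoretical = mass_a + mass_b + linker_mass
        if abs(theoretical - observed_mass) / theoretical * 1e6 <= tol_ppm:
            hits.append((name_a, name_b, theoretical))
    return hits


# Hypothetical peptides with nearly isobaric neighbours.
peptides = {"P1": 1000.500, "P2": 1200.600, "P3": 1000.520, "P4": 1200.590}
observed = 1000.500 + 1200.600 + 100.0   # simulated P1-P2 cross-link

for tolerance in (2.0, 10.0):
    matches = crosslink_candidates(peptides, 100.0, observed, tolerance)
    print(tolerance, "ppm:", [(a, b) for a, b, _ in matches])
```

With these toy numbers a tight tolerance leaves a single candidate pair while a looser one admits several, and the number of pairs to test grows roughly with the square of the number of peptides, which is why MS/MS evidence and reagent-specific signatures are used alongside the precursor mass.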
A number of programs specifically designed for the processing of mass spectrometric cross-linking data have recently been developed (Mayne & Patterton, 2011). Available software packages range from simple prediction of the masses of cross-linked peptides to proteome-wide MS/MS-based identification of the cross-links. Most often, cross-linking studies are focused on a known protein or protein complex, although cross-linking is starting to be used for the examination of protein interaction networks on the proteome-wide scale (Yang et al., 2012). In our laboratory, we usually employ the ICC-CLASS software package (Petrotchenko & Borchers, 2010b) for interrogation of the LC–MS/MS data from experiments using isotopically coded collision-induced dissociation (CID)-cleavable cross-links. This software has components that have been tailored for the analysis of data from cross-linking experiments using specific techniques, such as 15N-labeling, isotopically coded N-terminal modification of the cross-links, etc. Of all the different types of data that can be determined by mass spectrometry-based structural proteomics, interresidue distance constraints are the most obvious type to be incorporated into protein structure modeling software (see below).
We have used most of these recently developed methods in our cross-linking and mass spectrometry approach for the characterization of the prion protein conversion and for elucidating the structure of the resulting aggregate. Our first line of experiments is usually Lys–Lys cross-linking, using our isotopically coded CID-cleavable affinity-purifiable amine-reactive cross-linker CBDPS-H8/D8 (Petrotchenko, Serpa, & Borchers, 2011). Cross-linking was followed by proteinase K digestion, purification with avidin, and LC–MS/MS analysis on an Orbitrap instrument (Fig. 6.4A). The use of proteinase K allowed unambiguous identification of the interpeptide cross-links in the prion protein, which has few tryptic cleavage sites and is resistant to enzymatic digestion by more-specific enzymes when in its aggregated form. We were able to detect and identify 13 cross-links, some of which were preferentially found in either the PrPC or PrPβ forms of the protein. Analysis of these data revealed that some of the cross-links observed in PrPβ (K185–K220 and K204–K220 in the H1–β2/H2–H3 region) were not compatible with the NMR structure of PrPC, suggesting specific sites of conformational change or the formation of new interprotein contacts in PrPβ (Fig. 6.4B).
To further examine not only the conformational changes of the individual PrPβ molecules but also the arrangement of the PrPβ molecules in an aggregate, we utilized a 15N metabolic labeling strategy to discriminate between intra- and interprotein cross-links (Taverner et al., 2002). To obtain tighter distance constraints, we used several zero-length cross-linking reagents. 15N-labeled PrP was produced and mixed 1:1 with 14N-PrP, followed by conversion, cross-linking, and enzymatic digestion. While intraprotein cross-links produce only 14N–14N or 15N–15N paired peptides and a doublet signature, interprotein cross-links produce a series of four peaks: 14N–14N, 14N–15N, 15N–14N, and 15N–15N. Cross-links can be further confirmed by characteristic “signatures” in the fragment ions. An in-house program called 14N15N DXMSMS Match was developed to analyze this kind of data. This strategy resulted in the identification of 13 intraprotein cross-links and 11 interprotein cross-links. The intraprotein cross-links confirmed the suggested rearrangement of the β1–H1–β2 loop away from the H2–H3 interface, and the introduction of a segment of the N-terminal tail region into the core. Interprotein cross-linking has allowed, for the first time, the experimental determination of the stacking of the prion protein monomers within the oligomer.
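The doublet and four-peak signatures can be predicted from the nitrogen content of the two cross-linked peptides. The following sketch assumes complete 15N incorporation, unmodified peptides, and a cross-linker that contributes no labeled nitrogen; the peptide sequences are hypothetical, and the code is not related to the in-house program mentioned above.

```python
N15_SHIFT = 0.9970349  # mass difference between 15N and 14N in Da

# Nitrogen atoms per residue (backbone amide plus side chain).
NITROGENS = {
    "G": 1, "A": 1, "S": 1, "P": 1, "V": 1, "T": 1, "C": 1, "L": 1, "I": 1,
    "D": 1, "E": 1, "M": 1, "F": 1, "Y": 1,
    "N": 2, "Q": 2, "K": 2, "W": 2, "H": 3, "R": 4,
}


def nitrogen_count(peptide):
    """Number of nitrogen atoms in an unmodified peptide."""
    return sum(NITROGENS[aa] for aa in peptide)


def signature_shifts(peptide_a, peptide_b, interprotein):
    """Mass shifts (Da) of the expected peaks relative to the all-14N cross-link.

    Intraprotein: both peptides come from the same molecule, so only the
    14N-14N and 15N-15N forms appear. Interprotein: the two mixed forms appear
    as well (they coincide if both peptides contain the same number of N atoms).
    """
    shift_a = nitrogen_count(peptide_a) * N15_SHIFT
    shift_b = nitrogen_count(peptide_b) * N15_SHIFT
    shifts = {0.0, shift_a + shift_b}
    if interprotein:
        shifts |= {shift_a, shift_b}
    return sorted(shifts)


print(signature_shifts("GK", "AKR", interprotein=False))  # doublet
print(signature_shifts("GK", "AKR", interprotein=True))   # four peaks
```

Dividing each shift by the charge state gives the expected m/z spacing, and in a 1:1 mixture of 14N and 15N protein the four interprotein peaks would be expected at comparable intensities.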
7. ADDITIONAL MASS SPECTROMETRIC TECHNIQUES FOR THE PROTEIN STRUCTURE ANALYSIS
The collection of structural proteomics techniques is enhanced by any other complementary mass spectrometry approach that can provide detailed protein structural information. Native ESI-MS, for example, can provide some information on the possible arrangements of the subunits in multicomponent protein assemblies, deduced from the dissociation pattern and the order in which the proteins “fall off” the complex (Marcoux & Robinson, 2013). Ion-mobility MS is able to provide specific conformational characteristics of proteins and protein complexes, derived from their measured cross-sectional areas (Konijnenberg, Butterer, & Sobott, 2013). All of this additional information can be used when models of the final protein structure or the protein’s conformational changes are being postulated.
8. COMBINATION OF MULTIPLE STRUCTURAL PROTEOMICS TECHNIQUES We believe that combining multiple structural proteomics approaches for the characterization of the proteins under study is crucial for solving protein structures. Although no single method can provide complete structural information on its own, each method provides different and specific structural information on the protein. Thus, a combination of these approaches may provide sufficient complementary information to derive the detailed protein structure. Results from different methods verify and support each other’s findings and, ultimately, provide confidence in the final result. For the prion protein study mentioned here, we used limited proteolysis, chemical surface modification, HDX, and cross-linking as part of our collection of structural proteomics tools. The data from these multiple approaches are in remarkable agreement and have provided a total of >30 residue-specific constraints, which collectively suggest that the rearrangement of the β1–H1–β2–H2 region is the major conformational difference between PrPC and PrPβ (Fig. 6.5). A conformational change in the H1–β2-rigid loop region and distortion of its contact with helices 2 and 3 would create new hydrophobic patches on the surface of the molecule, which, in turn, could be responsible for driving the aggregation process. Through analysis of the intra- and interprotein constraint data, only one possible dimeric structure was found that satisfied all of the constraints. Thus, this can be considered the first experimentally determined structure for the early conversion and aggregation events in the prion protein’s misfolding process. This study of prion proteins has illustrated, and validated, the utility of applying an entire arsenal of structural proteomics methods to produce a detailed and comprehensive characterization of the conformational changes and the aggregation process. The results obtained thus far have encouraged us to propose a similar approach to the investigation of the structure of multiple protein systems.
Figure 6.5 Summary of the structural differences between PrPC and PrPβ, as revealed by multiple structural proteomics methods. The residues that are preferentially modified or cross-linked in the native PrPC and oligomeric PrPβ samples are highlighted in light grey and dark grey, respectively. The preferential pepsin cleavage site for PrPβ is indicated by a light grey arrow. The region of the structure which loses protection from hydrogen–deuterium exchange in the PrPβ sample is indicated by the light grey arc. The K185–K204, K185–K220, and K204–K220 CBDPS cross-links (light grey dashed lines) are present only in the oligomeric PrPβ sample. The K185–K220 and K204–K220 cross-links are incompatible with the native PrPC structure, which suggests a possible conformational change in the PrPβ aggregated form of the protein. The data from multiple approaches collectively suggest rearrangement of the β1–H1–β2–H2 region in PrPβ. Reprinted from Serpa, Patterson, et al. (2013), with permission.
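Testing whether an observed cross-link is compatible with a known structure, as was done here against the native PrPC structure, amounts to comparing an inter-residue distance with the maximum span of the reagent. A minimal Python sketch of such a check follows, assuming straight-line distances between one representative atom per residue. The coordinates, residue numbers, and the 25 Å limit are invented for illustration and are not taken from any prion structure or reagent specification.

```python
import math


def incompatible_cross_links(coords, cross_links, max_span):
    """Return the cross-links whose residues lie farther apart than max_span.

    coords      -- dict mapping residue number to an (x, y, z) tuple in Å
    cross_links -- list of (residue_i, residue_j) pairs observed by MS
    max_span    -- largest distance (Å) the reagent is assumed to bridge
    """
    return [
        (i, j)
        for i, j in cross_links
        if math.dist(coords[i], coords[j]) > max_span
    ]


# Hypothetical toy structure: three residues placed along a line.
toy_structure = {10: (0.0, 0.0, 0.0), 40: (10.0, 0.0, 0.0), 75: (30.0, 0.0, 0.0)}
observed = [(10, 40), (10, 75), (40, 75)]

print(incompatible_cross_links(toy_structure, observed, max_span=25.0))
# prints [(10, 75)]
```

A cross-link flagged in this way either points to a conformational change, as proposed for the aggregated form, or to an interprotein contact, which is why the isotope-labeling experiment is needed to tell the two apart.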
9. USE OF EXPERIMENTAL STRUCTURAL PROTEOMICS CONSTRAINTS IN PROTEIN STRUCTURE MODELING The output of a mass spectrometry-based structural proteomics experiment is a set of characteristics for the amino acid residues of the protein. To make sense of this array of experimental data, these pieces of information need to be translated into a final three-dimensional structure of the protein. Depending on the structural question for a particular study, differing numbers of experimental constraints may be needed to provide the answer. For example, a few long-distance cross-linking constraints may indicate a global conformational change, if differentially observed between two conformational states of the protein. A small amount of data on the changes in amino acid residue exposure between the free and bound states upon protein complex formation may designate the protein interaction interface. Likewise, a single cross-link can rule out a potential conformational model (Petrotchenko, Pedersen, Borchers, Tomer, & Negishi, 2001). The ultimate challenge, though, would be to automatically solve a problem in protein structure by simply inputting structural proteomics data. Unfortunately, to our knowledge, there is to date no turn-key protein structure modeling software that is able to incorporate all of the experimental information provided by structural proteomics and independently solve the structure. Human intervention is required in all cases. Selection of templates or fold recognition in the threading process can be achieved or influenced by a limited number of structural constraints (Young et al., 2000). However, true ab initio protein structure modeling would probably require numerous tight structural constraints. We envision several conceptual ways of accomplishing this. Multiple protein structural models could be generated and subsequently filtered for the selection of the “best” models, based on how well they satisfy the structural constraints derived from structural proteomics experiments. Alternatively, the modeling process itself could be guided by incorporating structural constraints into a scoring function, thereby influencing the pathway of the in silico folding process. Another possibility would be a constraint-guided three-dimensional arrangement of the secondary structural elements to generate an initial fold pattern, followed by refinement of the model. Protein modeling software programs such as Rosetta (Herzog et al., 2012) or NMR-based packages (Schwieters, Kuszewski, Tjandra, & Clore, 2003) can already use distance constraints as input data. Due to the chemical nature of the cross-linking process, however, only cross-links that can form along the surface of the protein, and not those that would have to penetrate the protein globule, should be used (Kahraman, Malmström, & Aebersold, 2011).
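The first two strategies, filtering pre-generated models and adding the constraints to a scoring function, can be sketched in a few lines. The Python example below is an illustration only and does not reproduce the interface of Rosetta or of any NMR package. It assumes a flat-bottom quadratic penalty on straight-line distances, whereas a distance measured along the protein surface would be more appropriate for cross-links. All model names, coordinates, and distance limits are hypothetical.

```python
import math


def restraint_penalty(distance, upper_limit):
    """Flat-bottom penalty: zero up to the limit, quadratic beyond it."""
    excess = max(0.0, distance - upper_limit)
    return excess ** 2


def score_model(coords, restraints):
    """Sum the penalties over all (residue_i, residue_j, upper_limit) restraints."""
    return sum(
        restraint_penalty(math.dist(coords[i], coords[j]), limit)
        for i, j, limit in restraints
    )


def rank_models(models, restraints):
    """Return model names ordered from lowest to highest total penalty."""
    return sorted(models, key=lambda name: score_model(models[name], restraints))


def satisfying_models(models, restraints):
    """Keep only the models that violate no restraint (total penalty of zero)."""
    return [name for name in models if score_model(models[name], restraints) == 0.0]


# Hypothetical candidate models: residue number -> (x, y, z) in Å.
models = {
    "model_a": {1: (0.0, 0.0, 0.0), 2: (12.0, 0.0, 0.0), 3: (0.0, 20.0, 0.0)},
    "model_b": {1: (0.0, 0.0, 0.0), 2: (30.0, 0.0, 0.0), 3: (0.0, 20.0, 0.0)},
}
# Hypothetical cross-link restraints with a 25 Å upper limit.
restraints = [(1, 2, 25.0), (1, 3, 25.0)]

print(rank_models(models, restraints))        # best-scoring model first
print(satisfying_models(models, restraints))  # models with no violations
```

The same penalty term could in principle be added to the energy function of a folding simulation rather than applied after the fact, which corresponds to the second strategy described in the text.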
10. FUTURE DIRECTIONS Structural proteomics using mass spectrometry for the structural analysis of proteins has a bright and promising future. The widening availability of high-mass-accuracy, high-performance, high-sensitivity instruments, such as Thermo’s Orbitrap, which greatly facilitate successful cross-linking applications, will allow the use of mass spectrometry-based structural proteomics by larger numbers of molecular biology researchers. Further development of mass spectrometric techniques, instrumentation, and methods, including top-down analysis, new fragmentation techniques, and gas-phase reactions, is certain to continue to have a positive impact on this field. Providing more detailed structural information on proteins will require a new generation of nonselective modification reagents and short-range nonspecific cross-linking reagents. A collection of reagents of varying specificities and characteristics, specifically designed to facilitate downstream mass spectrometric analysis, as well as easy-to-use software for rapid processing of the data, are also elements of the structural proteomics toolkit. Interpretation of the resulting structural data is also tightly linked to the development of protein modeling software. Easy-to-use protein modeling programs that can automatically incorporate and integrate the distance and exposure information generated by structural proteomics experiments also need to be developed. Together, these developments should lead to rapid progress in this exciting field.
11. CONCLUSIONS In summary, mass spectrometry-based structural proteomics is already being successfully applied to many aspects of the structural analysis of proteins and protein complexes, including the analysis of protein structures and conformational changes, the determination of protein interaction interfaces, and the elucidation of the topology of multisubunit protein complexes. Although not discussed in this review, the first examples of the application of structural proteomics techniques to the identification of proteome-wide protein interactions have recently been presented (Herzog et al., 2012; Zheng et al., 2011). With the successful integration of multiple types of experimental data into the modeling process, we envision that the “holy grail” of structural proteomics, the autonomous solving of protein structures, will be achieved in the very near future.
ACKNOWLEDGMENT This work was supported by a Genome Canada, Genome British Columbia, Technology Development Grant.
REFERENCES
Back, J. W., Notenboom, V., de Koning, L. J., Muijsers, A. O., Sixma, T. K., de Koster, C. G., et al. (2002). Identification of cross-linked peptides for protein interaction studies using mass spectrometry and 18O labeling. Analytical Chemistry, 74(17), 4417–4422.
Buncherd, H., Nessen, M. A., Nouse, N., Stelder, S. K., Roseboom, W., Dekker, H. L., et al. (2012). Selective enrichment and identification of cross-linked peptides to study 3-D structures of protein complexes by mass spectrometry. Journal of Proteomics, 75(7), 2205–2215.
Chen, X., Chen, Y. H., & Anderson, V. E. (1999). Protein cross-links: Universal isolation and characterization by isotopic derivatization and electrospray ionization mass spectrometry. Analytical Biochemistry, 273(2), 192–203.
Chen, Z. A., Jawhari, A., Fischer, L., Buchen, C., Tahir, S., Kamenski, T., et al. (2010). Architecture of the RNA polymerase II-TFIIF complex revealed by cross-linking and mass spectrometry. EMBO Journal, 29(4), 717–726.
Chowdhury, S. M., Du, X., Tolić, N., Wu, S., Moore, R. J., Mayer, M. U., et al. (2009). Identification of cross-linked peptides after click-based enrichment using sequential collision-induced dissociation and electron transfer dissociation tandem mass spectrometry. Analytical Chemistry, 81(13), 5524–5532.
Fasold, H., Klappenberger, J., Meyer, C., & Remold, H. (1971). Bifunctional reagents for the crosslinking of proteins. Angewandte Chemie International Edition in English, 10(11), 795–801.
Fujii, N., Jacobsen, R. B., Wood, N. L., Schoeniger, J. S., & Guy, R. K. (2004). A novel protein crosslinking reagent for the determination of moderate resolution protein structures by mass spectrometry (MS3-D). Bioorganic and Medicinal Chemistry Letters, 14(2), 427–429.
Green, N. S., Reisler, E., & Houk, K. N. (2001). Quantitative evaluation of the lengths of homobifunctional protein cross-linking reagents used as molecular rulers. Protein Science, 10(7), 1293–1304.
Herzog, F., Kahraman, A., Boehringer, D., Mak, R., Bracher, A., Walzthoeni, T., et al. (2012). Structural probing of a protein phosphatase 2A network by chemical crosslinking and mass spectrometry. Science, 337(6100), 1348–1352.
Kahraman, A., Malmström, L., & Aebersold, R. (2011). Xwalk: Computing and visualizing distances in cross-linking experiments. Bioinformatics, 27(15), 2163–2164.
Konermann, L., Vahidi, S., & Sowole, M. A. (2014). Mass spectrometry methods for studying structure and dynamics of biological macromolecules. Analytical Chemistry, 86(1), 213–232.
Konijnenberg, A., Butterer, A., & Sobott, F. (2013). Native ion mobility-mass spectrometry and related methods in structural biology. Biochimica et Biophysica Acta, 1834(6), 1239–1256.
Kruppa, G. H., Schoeniger, J., & Young, M. M. (2003). A top down approach to protein structural studies using chemical cross-linking and Fourier transform mass spectrometry. Rapid Communications in Mass Spectrometry, 17, 155–162.
Leitner, A., Reischl, R., Walzthoeni, T., Herzog, F., Bohn, S., Förster, F., et al. (2012). Expanding the chemical cross-linking toolbox by the use of multiple proteases and enrichment by size exclusion chromatography. Molecular and Cellular Proteomics, 11(3), M111.014126. Epub 2012 Jan 27.
Marcoux, J., & Robinson, C. V. (2013). Twenty years of gas phase structural biology. Structure, 21(9), 1541–1550.
Mayne, S. L., & Patterton, H. G. (2011). Bioinformatics tools for the structural elucidation of multi-subunit protein complexes by mass spectrometric analysis of protein-protein crosslinks. Briefings in Bioinformatics, 12(6), 660–671.
Novak, P., & Giannakopulos, A. E. (2007). Chemical cross-linking and mass spectrometry as structure determination tools. European Journal of Mass Spectrometry, 13(2), 105–113.
Novak, P., Young, M. M., Schoeniger, J. S., & Kruppa, G. H. (2003). A top-down approach to protein structure studies using chemical cross-linking and Fourier transform mass spectrometry. European Journal of Mass Spectrometry, 9(6), 623–631.
Pan, J., Han, J., Borchers, C. H., & Konermann, L. (2008). Electron capture dissociation of electrosprayed protein ions for spatially resolved hydrogen exchange measurements. Journal of the American Chemical Society, 130(35), 11574–11575.
Pan, J., Han, J., Borchers, C. H., & Konermann, L. (2009). Hydrogen/deuterium exchange mass spectrometry with top-down electron capture dissociation for characterizing structural transitions of a 17 kDa protein. Journal of the American Chemical Society, 131(35), 12801–12808.
Paramelle, D., Miralles, G., Subra, G., & Martinez, J. (2013). Chemical cross-linkers for protein structure studies by mass spectrometry. Proteomics, 13(3–4), 438–456.
Peters, K., & Richards, F. M. (1977). Chemical cross-linking: Reagents and problems in studies of membrane structure. Annual Review of Biochemistry, 46, 523–551.
Petrotchenko, E. V., & Borchers, C. H. (2010a). Crosslinking combined with mass spectrometry for structural proteomics. Mass Spectrometry Reviews, 29, 862–876.
Petrotchenko, E. V., & Borchers, C. H. (2010b). ICC-CLASS: Isotopically-coded cleavable crosslinking analysis suite. BMC Bioinformatics, 11(1), 64.
Petrotchenko, E., Doant, T., & Borchers, C. (2006). A novel chromophoric affinity-tagged isotopically-coded crosslinker DGDNBS. Presented at the 54th ASMS Conference on Mass Spectrometry and Allied Topics, Seattle, WA.
Petrotchenko, E. V., Pedersen, L. C., Borchers, C. H., Tomer, K. B., & Negishi, M. (2001). The dimerization motif of cytosolic sulfotransferases. FEBS Letters, 490(1–2), 39–43.
Petrotchenko, E. V., Serpa, J. J., Berjanskii, M., Suriyamongkol, B. P., Wishart, D. S., & Borchers, C. H. (2012). Use of proteinase K non-specific digestion for selective and comprehensive identification of interpeptide crosslinks: Application to prion proteins. Molecular and Cellular Proteomics, 11(7), M111.013524.
Petrotchenko, E. V., Serpa, J. J., & Borchers, C. H. (2010). Use of a combination of isotopically coded cross-linkers and isotopically coded N-terminal modification reagents for selective identification of inter-peptide crosslinks. Analytical Chemistry, 82(3), 817–823.
Petrotchenko, E. V., Serpa, J. J., & Borchers, C. H. (2011). An isotopically-coded CID-cleavable biotinylated crosslinker for structural proteomics. Molecular and Cellular Proteomics, 10(2). http://dx.doi.org/10.1074/mcp.M110.001420.
Schwieters, C. D., Kuszewski, J. J., Tjandra, N., & Clore, G. M. (2003). The Xplor-NIH NMR molecular structure determination package. Journal of Magnetic Resonance, 160(1), 65–73.
Serpa, J. J., Parker, C. E., Petrotchenko, E. V., Han, J., Pan, J., & Borchers, C. H. (2012). Mass spectrometry-based structural proteomics. European Journal of Mass Spectrometry, 18(2), 251–267.
Serpa, J. J., Patterson, A. P., Pan, J., Han, J., Wishart, D. S., Petrotchenko, E. V., et al. (2013). Using multiple structural proteomics approaches for the characterization of prion proteins. Journal of Proteomics, 81, 31–42.
Serpa, J. J., Petrotchenko, E. V., Wishart, D. S., & Borchers, C. H. (2013). Using isotopically-coded hydrogen peroxide as a surface modification reagent for the structural characterization of prion-protein aggregates. Presented at the 61st ASMS Conference on Mass Spectrometry and Allied Topics, Minneapolis, MN.
Sohn, C. H., Agnew, H. D., Lee, J. E., Sweredoski, M. J., Graham, R. L., Smith, G. T., et al. (2012). Designer reagents for mass spectrometry-based proteomics: Clickable crosslinkers for elucidation of protein structures and interactions. Analytical Chemistry, 84(6), 2662–2669.
Taverner, T., Hall, N. E., O’Hair, R. A. J., & Simpson, R. J. (2002). Characterization of an antagonist interleukin-6 dimer by stable isotope labelling, cross-linking and mass spectrometry. Journal of Biological Chemistry, 277(48), 46487–46492.
Wang, B., & Håkansson, K. (2008). Design and evaluation of a novel homobifunctional cross-linker with selective metal dioxide-based enrichment potential. Presented at the 56th ASMS Conference on Mass Spectrometry and Allied Topics, Denver, CO.
Yan, F., Che, F. Y., Rykunov, D., Nieves, E., Fiser, A., Weiss, L. M., et al. (2009). Nonprotein based enrichment method to analyze peptide cross-linking in protein complexes. Analytical Chemistry, 81(17), 7149–7159.
Yang, L., Zheng, C., Weisbrod, C. R., Tang, X., Munske, G. R., Hoopmann, M. R., et al. (2012). In vivo application of photocleavable protein interaction reporter technology. Journal of Proteome Research, 11(2), 1027–1041.
Young, M. M., Tang, N., Hempel, J. C., Oshiro, C. M., Taylor, E. W., Kuntz, I. D., et al. (2000). High throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry. Proceedings of the National Academy of Sciences USA, 97, 5802–5806.
Zheng, C., Yang, L., Hoopmann, M. R., Eng, J. K., Tang, X., Weisbrod, C. R., et al. (2011). Cross-linking measurements of in vivo protein complex topologies. Molecular and Cellular Proteomics, 10(10), M110.006841.