Mass Spectroscopy of Proteins
E V Romanova, S P Annangudi, and J V Sweedler, University of Illinois, Urbana, IL, USA
© 2009 Elsevier Ltd. All rights reserved.
Introduction
Mass spectrometry (MS) is an analytical technique for the determination of the composition of chemical substances by measuring accurate molecular masses. In MS analysis, the molecules of interest are vaporized and ionized, and the mass-to-charge (m/z) ratios of molecular ions are determined. When these gas-phase ions are broken down into characteristic fragments, the measurement of the masses of the fragments provides information on the original ion’s molecular structure. Although MS has been used to characterize small molecules for almost a century, the ability to vaporize and ionize larger biopolymers such as proteins was limited, preventing the application of MS to many protein characterization tasks. However, breakthrough developments in soft ionization methods, acknowledged by the Nobel Prize to K Tanaka and JB Fenn in 2002, advanced the application of MS to protein analysis. By combining liquid-phase separations to fractionate complex samples with modern MS measurement techniques, a wide range of peptide and protein characterization efforts are now possible. In recent decades, MS has developed into a highly sensitive, accurate, and powerful tool for rapid qualitative and semi-quantitative characterization of peptides and proteins present in samples ranging from individual cells, tissue sections, and biological fluids to entire organisms. This article provides a brief overview of the capabilities of MS for protein measurements, with an emphasis on neuroscience-related samples. Throughout the article, illustrative references provide additional information.
Principles of Mass Spectrometric Analysis and Instrumentation
MS analysis consists of two essential steps: (1) a process to vaporize and ionize analytes of interest from a sample, followed by (2) a means to measure the masses of the resulting ions. The most effective ionization of proteins is achieved by soft ionization methods such as electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) (Figure 1). Ionization by either ESI or MALDI imparts little internal energy to the ionized molecules, limiting their fragmentation so that larger molecules can be studied. Briefly, in MALDI a liquid or solid biological sample is co-crystallized with a large excess of ultraviolet (UV)-absorbing matrix; illumination of the matrix-sample crystals by a UV laser results in a microexplosion that produces primarily singly charged gaseous analyte ions. In ESI, an electric potential is applied to a capillary filled with a liquid sample. This generates an electrospray consisting of small liquid droplets containing the sample, which are desolvated and form multiply charged gas-phase ions. Unlike a MALDI mass spectrum, an ESI mass spectrum of an individual peptide or protein contains a number of signals corresponding to different charge states of that analyte.
The two approaches are complementary and can be used for the same samples. However, differing capabilities and figures of merit often suggest that one approach is better suited to a particular sample type than the other. As an example, samples containing high levels of salts, or those that require direct measurement of compounds from a tissue, may be better suited to MALDI approaches, whereas liquid-phase purifications of biological samples may be more easily interfaced to an ESI MS system. Both ionization approaches are often combined with triple quadrupole, ion trap, Fourier transform (FT) ion cyclotron resonance (ICR), or time-of-flight (TOF) mass analyzers (see Box 1), each of which has particular advantages for protein identification. The wide range of current mass spectrometric instrumentation provides a choice of tools for a variety of applications in life sciences.
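The charge-state series that ESI produces can be illustrated with a short calculation. The following is a minimal sketch, assuming that every charge comes from an added proton (positive-ion mode) and that two adjacent peaks in the series differ by exactly one charge. The function names and the 17 000 Da example mass are hypothetical and chosen only for illustration.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz_for_charge(neutral_mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass M (Da) and charge z."""
    return (neutral_mass + z * PROTON) / z


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge of the peak at mz_high, assuming mz_low carries one more proton.

    From mz_high = M/z + p and mz_low = M/(z + 1) + p it follows that
    z = (mz_low - p) / (mz_high - mz_low).
    """
    return (mz_low - PROTON) / (mz_high - mz_low)


def neutral_mass_from_peaks(mz_high, mz_low):
    """Return (neutral mass, charge of the mz_high peak) from two adjacent peaks."""
    z = round(charge_from_adjacent_peaks(mz_high, mz_low))
    return z * (mz_high - PROTON), z


if __name__ == "__main__":
    mass = 17000.0  # hypothetical protein mass in Da
    series = {z: mz_for_charge(mass, z) for z in range(10, 16)}
    for z, mz in series.items():
        print(f"z = {z:2d}  m/z = {mz:9.3f}")
    est_mass, est_z = neutral_mass_from_peaks(series[10], series[11])
    print(f"recovered mass = {est_mass:.2f} Da from the z = {est_z} peak")
```

In practice, deconvolution software fits the whole charge-state envelope instead of a single pair of peaks, which averages out measurement error.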
Protein Identification and Structure Determination
Identifying the protein components of specific brain regions and individual nerve cells provides a basis for understanding numerous neurobiological phenomena. MS methodologies have the capability of identifying proteins by directly deducing their primary structure. The most common approach, known as peptide mass fingerprinting or a bottom-up measurement, involves digestion of a purified protein of interest or a simple protein mixture into smaller peptides using enzymes such as trypsin. The masses of the resulting peptide ions are measured, compared to predicted masses of proteolytic peptides from a database (which often are unique for a given protein), and mapped to protein sequence stretches that match experimentally observed peptide masses (Figure 2). Peptide mass fingerprinting alleviates the need for characterizing high-molecular-weight proteins and makes protein analysis amenable
Figure 1 Two ionization techniques commonly used in biological mass spectroscopy: (a) electrospray ionization (ESI); (b) matrix-assisted laser desorption/ionization (MALDI). As shown in (a), ESI is an ionization technique that uses high voltage at a capillary tip to disperse and spray solvated analytes from the capillary. The spray produces multiply charged ions that are transferred directly into a mass analyzer. As shown in (b), MALDI is a technique that uses a laser to irradiate analytes co-crystallized with a matrix that absorbs the laser energy and forms singly charged gaseous analyte ions.
Box 1 The mass analyzers used for MS-based characterization
(Schematic panels a–d; labels include detector, superconducting magnet, ICR cell, ion trap, and ion path.)
The mass analyzer is the part of the instrument that subjects the ions to a series of electric or magnetic fields in order to allow determination of the m/z of the ions. As different mass analyzers offer different figures of merit (measurement speed, mass resolution, mass range, dynamic range), a brief understanding of these differences can help the user select the appropriate instrument for a specific measurement task. The following briefly describes and illustrates four common mass analyzers:
(a) In quadrupole mass analyzers, the ions generated by either ionization technique (ESI or MALDI) pass through a set of four rods that have DC and alternating RF potentials applied to them; these potentials focus the ions and allow selected ions to reach the detector. In a triple quadrupole instrument, the ions pass through successive sets of quadrupoles, with each set used for ion selection, ion focusing, and fragmentation, respectively, before they reach the detector.
(b) Quadrupole ion trap instruments use hyperbolic-shaped electrodes to confine ions to stable trajectories inside a specific region, the ion trap, with the potentials applied to these electrodes set according to the m/z range of the ions to be measured. The confined ions can then be either ejected into the detector or dissociated by collisions prior to remeasurement or ejection.
(c) In TOF analyzers, the ions are formed in the ionization source and accelerated by an applied potential so that all ions of a given charge have the same kinetic energy before being introduced to the field-free region of the mass analyzer. Because the kinetic energy is the same, the velocity of each ion depends on its m/z: ions of lower m/z travel faster and reach the detector first. The flight time that each ion spends traveling in the field-free region is converted to an m/z value. Conceptually, the TOF instrument is a simple mass analyzer that has the ability to measure ions over a wide mass range.
(d) The operational principle of FT-ICR is to subject ions to a uniform, high magnetic field while in the ICR cell; this field causes the ions to move in circular orbits (i.e., cyclotron motion) at a frequency that depends on their m/z. Because a large number of orbits can be measured, the frequency of ion motion can be measured accurately and precisely. The Fourier transform converts the recorded time-domain signal into these frequencies, which are then converted to m/z values. FT-ICR offers unparalleled mass accuracy and resolution.
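The relationships behind the TOF and FT-ICR analyzers in Box 1 can be written down compactly. The sketch below assumes an idealized linear TOF (acceleration through a potential V, then a field-free path of length L, so that zeV = mv²/2) and the unperturbed cyclotron frequency f = zeB/(2πm). Real instruments are calibrated empirically; the function names and parameters here are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
DALTON = 1.66053906660e-27  # one dalton in kilograms


def tof_flight_time(mz, accel_volts, path_m):
    """Flight time (s) through a field-free path for an ion of given m/z.

    From z*e*V = (1/2)*m*v**2 it follows that t = L * sqrt(m / (2*z*e*V)).
    """
    mass_per_charge = mz * DALTON / E_CHARGE  # kg per coulomb
    return path_m * math.sqrt(mass_per_charge / (2.0 * accel_volts))


def mz_from_flight_time(t, accel_volts, path_m):
    """Invert the flight-time relation to recover m/z (Da per charge)."""
    return 2.0 * accel_volts * (t / path_m) ** 2 * E_CHARGE / DALTON


def cyclotron_frequency(mz, b_tesla):
    """Unperturbed cyclotron frequency (Hz) in a magnetic field B."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * mz * DALTON)


def mz_from_cyclotron_frequency(freq_hz, b_tesla):
    """Invert the cyclotron relation to recover m/z (Da per charge)."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * freq_hz * DALTON)


if __name__ == "__main__":
    # Hypothetical settings: 20 kV acceleration, 1 m flight tube, 7 T magnet.
    for mz in (500.0, 1000.0, 2000.0):
        t = tof_flight_time(mz, 20000.0, 1.0)
        f = cyclotron_frequency(mz, 7.0)
        print(f"m/z {mz:6.0f}: flight time {t * 1e6:6.2f} us, cyclotron {f / 1e3:7.1f} kHz")
```

Flight time grows with the square root of m/z, whereas cyclotron frequency falls as 1/(m/z); because a frequency can be measured over many orbits, the second relation is the basis of the high accuracy attributed to FT-ICR above.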
Figure 2 Peptide mass fingerprint of myoglobin by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectroscopy. Mass spectrum shows multiple peaks that represent peptides resulting from proteolytic digestion of purified horse myoglobin standard. Peptide peaks are labeled by position of the peptide amino acids in the protein sequence. Identities of measured peptides are often assigned using bioinformatics approaches such as MASCOT. This program compares the detected masses in the spectra to predicted masses of proteolytic peptides from the protein database and determines which proteins best match statistically with the experimental data. Sequence stretches underlined in blue denote the peptide identified in the spectrum, and red denotes the sites of trypsin cleavage. In this example, the resulting tryptic peptides cover 94% of the protein sequence. m/z, mass-to-charge ratio.
Protein sequence shown in the figure (residue number at the start of each row):
  1 GLSDGEWQQV LNVWGKVEAD IAGHGQEVLI RLFTGHPETL EKFDKFKHLK TEAEMKASED
 61 LKKHGTVVLT ALGGILKKKG HHEAELKPLA QSHATKHKIP IKYLEFISDA IIHVLHSKHP
121 GDFGADAQGA MTKALELFRN DIAAKYKELG FQG
Spectrum axes: relative intensity versus m/z (600–2000). Labeled peaks (residue ranges): 1–16, 17–31, 32–42, 32–45, 43–47, 48–56, 51–56, 64–77, 79–96, 80–96, 97–102, 103–118, 119–133, 134–139, 134–145, 146–153, 148–153.
to several types of mass spectrometers. However, if a crude extract containing numerous proteins is digested, it places severe demands on the separation and analysis system; a sample that initially contained hundreds of protein compounds contains tens of thousands of proteolytic peptides after digestion. Existing knowledge on gene expression in the biological tissue or organism under analysis allows identification of proteins by peptide fingerprinting when combined with database searches. Numerous search algorithms are available that rank the recovered protein sequences according to their probability of producing a peptide map that matches the experimentally observed peptides. As the number of organisms with known genomes increases, peptide fingerprinting is proving to be an excellent verification tool in protein expression studies.
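The matching step at the heart of peptide mass fingerprinting can be sketched in a few lines. The example below is a simplified illustration, not the algorithm of any particular search engine. It assumes complete trypsin cleavage after K or R except before P, no missed cleavages, no modifications, and singly protonated ions as in MALDI. It uses standard monoisotopic residue masses and the myoglobin sequence shown in Figure 2; the function names and the 0.2 Da tolerance are hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "I": 113.084064,
    "L": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276

# Horse myoglobin sequence as printed in Figure 2 (153 residues).
MYOGLOBIN = (
    "GLSDGEWQQV" "LNVWGKVEAD" "IAGHGQEVLI" "RLFTGHPETL" "EKFDKFKHLK"
    "TEAEMKASED" "LKKHGTVVLT" "ALGGILKKKG" "HHEAELKPLA" "QSHATKHKIP"
    "IKYLEFISDA" "IIHVLHSKHP" "GDFGADAQGA" "MTKALELFRN" "DIAAKYKELG"
    "FQG"
)


def tryptic_digest(sequence):
    """Return (first, last, peptide) tuples, 1-based, with no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_peaks(observed_mz, sequence, tol=0.2):
    """Pair each observed m/z with predicted tryptic peptides within tol (Da)."""
    matches = []
    for first, last, peptide in tryptic_digest(sequence):
        predicted = mh_plus(peptide)
        for mz in observed_mz:
            if abs(mz - predicted) <= tol:
                matches.append((mz, first, last, peptide))
    return matches


if __name__ == "__main__":
    for first, last, peptide in tryptic_digest(MYOGLOBIN):
        print(f"{first:3d}-{last:<3d} {mh_plus(peptide):9.3f}  {peptide}")
```

Several peaks labeled in Figure 2, such as 32–45 and 134–145, span more than one of these predicted peptides, which indicates missed cleavages; a practical search would also enumerate those products and score the matches statistically against every protein in the database.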
Instrumentation platforms that couple two or more mass analyzers in succession, known as tandem mass spectrometry (MS/MS), enable identification of a protein by fragmenting the analyte to allow the determination of its partial or complete amino acid sequence. For example, when analyzing a protein digest by MS/MS, one of the proteolytic peptides can be chosen after the first MS measurement and its molecular ions selectively fragmented by a variety of physical approaches for molecular dissociation. Ideally, fragmentation produces a ladder series of fragment ions (truncated forms of the initial peptide ions) that are analyzed by a second mass analyzer. Mass differences between fragment ions provide information on the arrangement of the amino acids within the fragmented peptide. The deduced peptide sequence, in turn, can be statistically matched to a specific part of a protein
in the database. The ability to sequence peptides by fragmentation minimizes the number of proteolytic peptides needed for confident protein assignment, which extends the use of the mass fingerprinting approach to identification of proteins in more complex mixtures, as well as to proteins that produce fewer proteolytic peptides. Sequencing of peptides from unknown proteins (de novo sequencing) may lead to the characterization of new gene products and annotation of new genes. With the advent of high-resolution mass spectrometers, a second, newer approach to protein identification has been developed, referred to as top-down MS. Unlike peptide mass fingerprinting, top-down analysis begins with direct MS detection and sequencing of an intact protein and lays a foundation for
full structural protein characterization. The sequencing of intact proteins is achieved by producing various truncated forms of an entire protein ion by fragmentation inside the mass spectrometer (Figure 3). Thus far, FT-ICR MS appears well suited for top-down protein analysis because it has the required high resolving power of mass measurement and sensitivity. In addition, highly efficient fragmentation techniques that yield complementary information on whole protein fragment ions have been developed specifically for FT-ICR MS. For protein assignment, the data obtained by top-down MS/MS can be processed by search engines in a manner similar to that of the bottom-up approach already described. Although highly accurate and effective in protein identification, the top-down approach is challenging when routinely handling proteins >50 kDa, crude samples with >200 proteins, or complex posttranslational modifications such as glycosylation.
Figure 3 Schematic of the top-down approach for protein sequencing and identification. First, the mass of the whole protein ion is determined. Protein ions are then fragmented within the mass spectrometer and the accurate mass of each of the resulting fragments is measured. Patterns of mass shifts between fragment ions are indicative of the sequence in which amino acids are arranged in the original molecular ion. Analysis of fragment ion patterns allows reconstruction of the protein sequence from both or either terminus. Known variations of mass shifts in fragment ion patterns elucidate the presence of posttranslational modifications. Identification of the deduced protein sequences requires database searches. m/z, mass-to-charge ratio. (Schematic labels: intact protein ion, posttranslational modifications, direct physical fragmentation, protein fragment ions, mass measurement, high-resolution mass spectrum for analysis of mass shifts.)
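The idea that mass differences between successive fragment ions reveal the order of amino acids, used in both peptide MS/MS and top-down sequencing, can be illustrated with an idealized ladder. The sketch below assumes a complete series of singly charged, N-terminal (b-type) fragment ions with no noise and no modifications, which real spectra rarely provide. The function names and the 0.01 Da tolerance are hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "I": 113.084064,
    "L": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
PROTON = 1.007276


def b_ion_ladder(peptide):
    """Idealized singly charged b-ion m/z values b1..bn for a peptide."""
    ions, total = [], PROTON
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def read_ladder(fragment_mz, tol=0.01):
    """Assign a residue to each gap between consecutive ladder ions.

    Residues of equal mass (I and L) are reported together, and a gap that
    matches no unmodified residue is reported as "?".
    """
    ordered = sorted(fragment_mz)
    calls = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol]
        calls.append("/".join(hits) if hits else "?")
    return calls


if __name__ == "__main__":
    ladder = b_ion_ladder("GAVSTK")  # hypothetical peptide
    print(read_ladder(ladder))       # residues 2..n read from the mass gaps
```

The first residue is not recovered because a gap needs two ions. A gap that matches no residue ("?") is where a search would consider a modified residue or a missing ladder ion, and the small mass difference between K and Q (about 0.036 Da) shows why accurate mass measurement matters.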
Characterization of Protein Posttranslational Modifications and Differential Gene Expression
According to a finding of the Human Genome Project, expressed protein forms outnumber the protein-coding genes by an order of magnitude (20 000–25 000 genes vs. 400 000 proteins). Protein diversity that begins with a single nucleotide polymorphism, cell- or tissue-specific alternative splicing, or editing of mRNAs encoded by one gene can be increased further by the addition or removal of various chemical groups to or from the primary protein structure after translation. These chemical changes, known as posttranslational modifications (PTMs), result in alterations of inherent molecular properties such as charge or conformation. These changes often confer unique bioactivity to the modified protein. MS is ideal for the discovery and characterization of PTMs on proteins because most PTMs result in characteristic mass changes to the original molecular ion. For example, phosphorylation of a protein molecule at serine, threonine, tyrosine, or aspartate results in a mass shift of approximately +80 Da (monoisotopic +79.97 Da) due to the addition of a phosphate group. In many cases, comparing calculated and measured molecular masses of intact ions can be sufficient to determine whether a protein or peptide molecule carries a PTM. With the help of the fragmentation techniques already described, the location of an anticipated PTM can be mapped to a particular amino acid residue within the protein by analyzing mass shifts among the fragment ions generated in an MS/MS spectrum. MS is also effective for the study of protein isoforms resulting from differentially expressed genes. Minor changes in a protein’s primary structure associated with alternative splicing or editing of the protein-encoding mRNA lead to changes in the molecular mass of the original protein ion.
Top-down protein identification by FT MS is especially effective in this case because various direct fragmentation techniques yield complementary fragment ions that often provide 100% protein sequence coverage and localize sites of basic PTMs.
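Comparing calculated and measured masses to flag a possible PTM amounts to a lookup of the mass difference. The sketch below assumes monoisotopic masses, a short illustrative table of common modifications (standard mass shifts), and only one modification type at a time. The function name, the 0.02 Da tolerance, and the limit of three copies are hypothetical choices.

```python
# Monoisotopic mass shifts (Da) for a few common modifications.
PTM_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "oxidation": 15.99491,
}


def candidate_ptms(calculated_mass, measured_mass, tol=0.02, max_count=3):
    """List (modification, copies) pairs that explain the observed mass shift.

    Only one modification type at a time is considered; combinations of
    different modifications are outside the scope of this sketch.
    """
    delta = measured_mass - calculated_mass
    hits = []
    for name, shift in PTM_SHIFTS.items():
        for count in range(1, max_count + 1):
            if abs(delta - count * shift) <= tol:
                hits.append((name, count))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide: calculated 1000.0000 Da, measured 1079.9663 Da.
    print(candidate_ptms(1000.0, 1079.9663))
```

A match of this kind only suggests that a modification is present; locating it on a particular residue still requires the fragment-ion analysis described in the text.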
Quantitative Analysis of Protein Expression
MS approaches to quantify protein levels from two or more complex mixtures use the ability of MS to differentiate isotopically distinct but chemically identical analytes. The technique is based on labeling each sample using isotopic labels, thus conferring a mass change that is detected using MS (Figure 4). Various labeling approaches have been developed for quantitative analysis of protein expression in biological samples, with isotopic labels, incorporated via metabolic, enzymatic, or chemical reaction, being the most widely used. In metabolic labeling, isotopes such as 15N or 13C are incorporated into cultured cells via a culture medium that is enriched with amino acids containing stable isotopes in light and heavy forms. The intensity ratio of the peptide peaks labeled with light and heavy isotopes in a mass spectrum reflects the relative abundance of the two peptide species in the sample (Figure 4(c)). Alternatively, isotopes can be introduced at particular places on protein molecules via specific chemical or enzymatic reactions with isotopically encoded reagents. The two MS-based quantitative reagents that have rapidly gained popularity are isotope-coded affinity tags and isobaric tags for relative and absolute quantification.
Strategies for MS determination of the absolute amounts of selected proteins in a sample rely on the addition of known amounts of isotopically labeled recombinant or synthetic internal standards of native proteins found in the sample. When added to a protein tryptic digest, such standards undergo trypsin treatment along with the endogenous proteins. The resulting tryptic peptides are then detected in a mass spectrometer as pairs of labeled and unlabeled identical peptides, differing by the mass of the known isotopic tag. The abundance of each endogenous protein in the original sample is deduced from the ratio between the peak intensities of each peptide pair and the known amount of added standard.
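The arithmetic behind isotope-label quantitation is simple enough to sketch. The example below assumes that light and heavy forms of a peptide ionize with equal efficiency and that signal intensity is proportional to amount over the measured range; both are idealizations. The function names, intensities, and amounts are hypothetical.

```python
def relative_ratio(light_intensity, heavy_intensity):
    """Light-to-heavy abundance ratio for one coeluting peptide pair."""
    if heavy_intensity <= 0:
        raise ValueError("heavy_intensity must be positive")
    return light_intensity / heavy_intensity


def absolute_amount(endogenous_intensity, standard_intensity, standard_amount):
    """Amount of endogenous peptide, in the same units as standard_amount.

    Scales the known amount of spiked, isotopically labeled standard by the
    intensity ratio of the endogenous peak to the standard peak.
    """
    if standard_intensity <= 0:
        raise ValueError("standard_intensity must be positive")
    return standard_amount * endogenous_intensity / standard_intensity


def pair_spacing(label_mass_difference, z):
    """Expected m/z spacing between light and heavy peaks at charge z."""
    return label_mass_difference / z


if __name__ == "__main__":
    # Hypothetical pair: light peak 2.0e5 counts, heavy peak 1.0e5 counts.
    print(relative_ratio(2.0e5, 1.0e5))
    # Hypothetical spike of 50 fmol standard giving one third of the endogenous signal.
    print(absolute_amount(3.0e5, 1.0e5, 50.0))
    # A 6 Da label difference on a doubly charged peptide.
    print(pair_spacing(6.0, 2))
```

Because the spacing between light and heavy peaks shrinks with charge, the pair of a multiply charged peptide lies closer together on the m/z axis than the label mass difference alone would suggest.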
Applications of Mass Spectrometry in Functional Proteomics
Dynamic molecular networks, which often include supramolecular assemblies of noncovalent protein–protein, protein–DNA, and protein–ligand complexes, regulate central neurophysiological processes such as signal transduction and plasticity. Although the structural analysis of macromolecular formations presents a significant analytical challenge, MS is becoming increasingly useful in mapping protein interactions. One MS-based approach for studying protein complexes uses the ability of ESI to preserve noncovalent interactions during analysis. In contrast to biophysical techniques such as X-ray crystallography and nuclear magnetic resonance spectroscopy (NMR), MS has the ability to detect different species simultaneously, often at lower concentrations. Using MS, it is possible to measure the mass of the entire complex, as well as the individual subunits produced by molecular collisions inside a tandem mass spectrometer. MS measurements
help deduce the stoichiometry of the components making up the complex and may reveal the presence of various metals in metalloproteins. The energy (voltage) required for dissociation of each particular complex serves as a relative measure of complex stability and interaction strength. Because MS can be applied to mapping protein interactions on a global scale, it may also be considered an improved alternative to yeast two-hybrid screening, co-immunoprecipitation, and affinity chromatography, methods used extensively in neuroscience.
Mass Spectrometry for Biomarker Discovery
The ability to measure simultaneously numerous protein components in a sample, and to differentiate between proteins and peptides with PTMs, makes MS well suited for large-scale comparisons of protein expression profiles in search of protein biomarkers that establish links to normal or pathogenic health conditions, disease progression, or therapeutic responses. The power of MS in biomarker discovery is that, in contrast to biomarker screening methods based on the use of selective probes, MS can be used without prior information on the form of the resulting biomarker. Moreover, instead of tracing changes in just a few particular proteins targeted by probes, MS screening can detect unexpected sets of diverse molecular changes that encompass a range of functional pathways in affected individuals. MS screening can be performed on a limited amount of tissue extract or body fluid and, in some cases, without extensive sample purification. Furthermore, MS profiling can be helpful in diagnosing conditions that include genetic abnormalities such as progressive mutations, as well as identifying PTMs. Current discovery efforts are focused on cancer, heart disease, Alzheimer’s disease, and rheumatoid arthritis markers.
Figure 4 Protein quantitation using stable isotope labeling and liquid chromatography (LC) mass spectroscopy: (a) labeling and mixing samples A and B; (b) reversed-phase liquid chromatography (RP-LC) separation; (c) mass spectrometry. In (a), proteins from two different samples (A and B) are isolated and digested using a proteolytic enzyme. Resulting proteolytic peptides are labeled using either heavy (sample A) or light (sample B) isotopic modifiers and combined. In (b), this mixture is fractionated by LC. Identical peptides originating from samples A and B, now carrying heavy or light isotope labels, respectively, coelute in the chromatogram. Coeluting peptides that are identical except for their labels differ by Δm, which equals the mass difference between heavy and light isotope labels. Therefore, identical peptides from different samples can be resolved and analyzed using a mass spectrometer (c). The differences in peak intensities of the labeled peptides reflect the differences in the peptide amounts in the original samples. The lower trace in (c) shows an example of a high-resolution zoom-in view of a typical quantitative mass spectrum. m/z, mass-to-charge ratio. (Panel axes: (b) optical density versus time; (c) relative intensity, 0–100%, versus m/z, 573–578, with light and heavy isotope-labeled peaks.)
Mass Spectrometric Measurements in the Brain
This article describes the capability of MS to detect, structurally characterize, identify, compare, and quantify compounds in biological samples at different organismal levels, facilitating investigations in many subfields of neuroscience. For instance, analysis of proteins at the system level is challenging because of the inherent complexity of the brain. Fractionation techniques such as two-dimensional gel electrophoresis, liquid chromatography, and affinity purifications are commonly used in conjunction with MS to reduce the chemical complexity of such samples and to enhance investigation at the protein level. Profiling of protein expression directly in brain tissues or brain extracts contributes to comparative proteomic analyses in normal as well as pathological conditions, including neurodegeneration, neurotrauma, psychiatric diseases, drug addiction, and nervous system tumors. An alternative approach is to decrease anatomical complexity and perform the analysis at the level of the individual neuron or even subcellular compartments. Currently, single-cell MS is effectively being used to map the cellular localization of novel peptide transmitters and analyze neuropeptide release at identified release areas. Combined with electrophysiology and molecular biology, single-cell MS has become an indispensable tool for delineating neuronal circuits that regulate simple behaviors in model invertebrates. A newer application of MS in neuroscience is imaging mass spectrometry. Unlike other imaging techniques, such as immunohistochemistry and autoradiography, MS imaging simultaneously detects numerous compounds without the use of specific molecular probes and labels. Imaging MS is becoming increasingly practical for mapping the spatial distribution of peptides and proteins and their structural variants directly in brain tissue and even individual neurons. Currently, MS is a mainstream neuroproteomic technology well suited for studying protein function.
Progress in sampling and separation methodologies, engineering, microfabrication, and bioinformatics continues to advance MS-based approaches for the investigation of protein structure and function in the central nervous system.
See also: Fluorescent Biomarkers in Neurons; Neuroproteomics.
Further Reading
Bogdanov B and Smith RD (2005) Proteomics by FTICR mass spectrometry: Top down and bottom up. Mass Spectrometry Reviews 24: 168–200.
Caldwell RL and Caprioli RM (2005) Tissue profiling by mass spectrometry: A review of methodology and applications. Molecular and Cellular Proteomics 4: 394–401.
Caprioli RM (ed.) (2004) The Encyclopedia of Mass Spectrometry: Biological Application, A. New York: Elsevier Science.
Choudhary J and Grant SG (2004) Proteomics in postgenomic neuroscience: The end of the beginning. Nature Neuroscience 7: 440–445.
Kalia A and Gupta RP (2005) Proteomics: A paradigm shift. Critical Reviews in Biotechnology 25: 173–198.
Krogan NJ, Cagney G, Yu H, et al. (2006) Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440: 637–643.
Li L, Garden RW, and Sweedler JV (2000) Single-cell MALDI: A new tool for direct peptide profiling. Trends in Biotechnology 18: 151–160.
Relevant Websites
http://www.asms.org – American Society for Mass Spectrometry (tutorial).
http://www.i-mass.com – International Mass Spectrometry Web Resource.
http://www.ionsource.com – Mass Spectrometry and Biotechnology resources.