Analysis and Interpretation of Mass Spectrometry Imaging Datasets


CHAPTER THIRTEEN

Markus de Raad*,†, Trent R. Northen*,†, Benjamin P. Bowen*,†,1
* Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
† The Joint Genome Institute, Lawrence Berkeley National Laboratory, Walnut Creek, CA, United States
1 Corresponding author: e-mail address: [email protected]

Contents
1. Introduction and Background
2. Commonly Used MSI Software
3. Data Types
4. Data Analysis
4.1 Preprocessing (Data Cleaning: Noise Reduction, m/z Calibration)
4.2 Normalization and Quantification in MSI
4.3 Untargeted Analysis: Feature Detection, Multivariate Statistics
4.4 Compound Identification
4.5 Stable Isotope Labelling: Isotopic Enrichment Calculations
4.6 Analyzing Spatial Defined Samples in MSI Datasets
5. Future Directions
Acknowledgements
References

Comprehensive Analytical Chemistry, Volume 82. ISSN 0166-526X. https://doi.org/10.1016/bs.coac.2018.06.006. © 2018 Elsevier B.V. All rights reserved.

1. INTRODUCTION AND BACKGROUND

Mass spectrometry imaging (MSI) enables examination of the localization of biological molecules. This information is often extremely valuable in the context of the 3D architecture of the biospecimen and can even enable analysis of low-abundance molecules that are obscured in bulk analyses. This approach has been used to study the spatial distribution of biomolecules within tissues, in biofilms, in chemical arrays and in other biological media [1,2]. Numerous experimental MSI platforms have been developed and are commercially available, including desorption electrospray ionization, laser ablation electrospray ionization, matrix-assisted laser desorption/ionization (MALDI), secondary-ion mass spectrometry, laser ablation-inductively coupled plasma, nanowire-assisted laser desorption ionization and nanostructure initiator mass spectrometry [3–8].

MSI image generation is uniform across MSI techniques, even though the raw data are dictated by the specific MSI platform (ion desorption/ionization, ion analysis and detection). In MSI, mass spectra are collected at discrete spatial points, usually in a raster format, generating a so-called datacube with a mass spectrum measured at each coordinate. To go from collected spectra to an interpretable MSI image, the raw data need to be processed. Data analysis is a significant challenge in MSI given the size of the files (e.g. 30 GB), which makes them difficult to analyse and compare. A broad range of software tools exists to help researchers go from raw spectra to an analyzable MSI image. In this chapter, we give an overview of existing software for MSI processing and molecular image generation, and describe several frequently used analytical algorithms and tools for data analysis (Fig. 1).

Fig. 1 Overview of the workflow in mass spectrometry imaging. Samples are prepared and mounted on a target surface. Either manufacturer or user software is used to control the mass spectrometer by defining the raster grid/sampling path across the surface of the sample. Next, the mass spectrometer collects mass spectra from unique locations across the sample surface in an automated manner, with each sampling location yielding its own mass spectrum. The acquired mass spectra are combined and processed using software algorithms to create an analyzable MSI image.


2. COMMONLY USED MSI SOFTWARE

To create images from MSI spectral data, the cumulative or maximum intensity over a selection of ions is calculated for each position in the dataset; when several ions are shown together, each ion can be represented by a unique colour. MSI data are typically stored as spectra rather than as images (see Section 3). However, there are at least two examples showing that storing duplicate copies of the data, one spectrum-aligned and one image-aligned, can accelerate MSI workflows [9,10]. These tools can extract images from the data thousands of times faster than is possible from the imzML or Analyze 7.5 formats [9]. Because these formats are less widespread, one would typically use an imzML file as a bridge between the proprietary vendor formats and them.

There are numerous software applications available for MSI. These tools allow users to open an MSI file, view spectra and view ion intensity images. The major differences between the tools are how large a file can be opened and how fast images can be shown. The most common tools for MSI analysis are listed in Table 1. The majority of these tools require a Microsoft Windows operating system to run and are publicly available. Two of the tools are built in MATLAB®, a commercial programming language for scientific data analysis.

Although a repository exclusively for warehousing and serving raw MSI data to accompany manuscripts has yet to be built, the PRIDE Archive is capable of meeting the technical requirements for MSI data [19]. One can search and freely download MSI datasets for numerous published MSI studies. In addition, OpenMSI and MetaSpace allow sharing of datasets via the web. OpenMSI facilitates interactive viewing of raw images and spectra [10]. MetaSpace facilitates the prediction of compound identifications and the sharing of images for these assignments [20]. Like other high-information-content analysis techniques, such as sequencing, proteomics and transcriptomics, MSI is developing reporting standards to make data sharing more routine and a necessary part of the publication process.
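The image-generation step described at the start of this section can be sketched in a few lines. The following is a minimal illustration, assuming the dataset is already held in memory as a NumPy array of shape (y, x, m/z) with a shared m/z axis; the function name `ion_image` and the synthetic values are hypothetical and do not correspond to any of the tools in Table 1.

```python
import numpy as np

def ion_image(datacube, mz_axis, mz_center, mz_tol, mode="sum"):
    """Collapse a (ny, nx, n_mz) datacube to a 2D image for one m/z window."""
    window = np.abs(mz_axis - mz_center) <= mz_tol
    if not window.any():
        raise ValueError("no m/z bins fall inside the requested window")
    selected = datacube[:, :, window]
    if mode == "sum":       # cumulative intensity over the selected ions
        return selected.sum(axis=2)
    if mode == "max":       # maximum intensity over the selected ions
        return selected.max(axis=2)
    raise ValueError("mode must be 'sum' or 'max'")

# Synthetic 2 x 3 pixel dataset with 5 m/z bins (all values are made up).
mz_axis = np.array([100.0, 100.1, 100.2, 200.0, 200.1])
datacube = np.arange(2 * 3 * 5, dtype=float).reshape(2, 3, 5)

summed = ion_image(datacube, mz_axis, mz_center=100.1, mz_tol=0.15, mode="sum")
peak = ion_image(datacube, mz_axis, mz_center=100.1, mz_tol=0.15, mode="max")
```

Three such images, one per ion, could be stacked into the red, green and blue channels of a colour image. For datasets that do not fit in memory, the same reduction would have to be done spectrum by spectrum or on an image-aligned copy of the data.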

Table 1 Software Available for MSI Data Analysis
Name | Commercial or Open Access | References | Website | Operating System
SpectViewer | Beta-version available for academic labs by request | [11] | https://ms-imaging.org/wp/imzml/software-tools/cea-spect-viewer/ | MS-Windows
DataCube Explorer | Free noncommercial community license | [12] | https://amolf.nl/download/datacubeexplorer | MS-Windows
MSiReader | Open-source license (BSD 3) | [13,14] | http://www4.ncsu.edu/dcmuddim/downloads.html | Cross platform (MATLAB® requirement on Linux and MacOS)
Omnispect | Source code is available. License is unspecified | [15] | https://github.com/rmparry7/omnispect | Cross platform (MATLAB® requirement)
OpenMSI | Open access. Account registration required for private data | [10] | https://openmsi.nersc.gov | Cross platform
Cardinal | Artistic-2.0 license | [16] | http://bioconductor.org/packages/release/bioc/html/Cardinal.html | Cross platform
msIQuant | A license key is required to run msIQuant and is generated for free after applying a license key request | [17] | https://msimaging.org/wp/paquan/ | MS-Windows
Spectral analysis | Apache license 2.0 | [18] | https://github.com/AlanRace/SpectralAnalysis | MS-Windows and MATLAB®
Quantinetix | Commercial | Imabiotech | https://www.imabiotech.com/ | MS-Windows
SCiLS Lab | Commercial | SCiLS | http://scils.de/ | MS-Windows

3. DATA TYPES

MSI data are acquired one spectrum at a time. Consequently, even though we think of MSI as an imaging method, data are acquired and stored as individual spectra. As mass spectrometry has inherently multimodal capabilities (e.g. switching instrument polarities, data-dependent fragmentation, etc.), many attributes may need to be stored for each spectrum, including polarity, fragmentation characteristics, lab-time, position and potentially many others. Although in MSI each spectrum is typically acquired under the same conditions, this is not necessarily always the case. To manage the attributes associated with the many modes in which spectra can be acquired, mass spectrometry data from all fields (proteomics, metabolomics, imaging, etc.) are often stored in XML-formatted files. These XML files use standard vocabularies to capture descriptions of the spectral attributes and store the spectra in a binary format inline with the text-based XML. Currently, there are many XML formats used in mass spectrometry, of which mzXML, mzData and mzML are the most widely used. Converters are required to translate data files acquired in the proprietary formats of the instrument vendors [21].

As one would expect, MSI shares the requirements common to all mass spectrometry applications. To meet this need, MSI files are typically converted from a proprietary vendor-defined format into a format called imzML [21]. These files are also XML files, but a distinct difference from the XML formats used in other areas of mass spectrometry is that the spectral data are stored in a separate binary file called an ibd file. Even though the imzML format is gaining widespread popularity, many other formats are still commonly used in MSI. This is partly because many research groups have instruments and workflows that predate the imzML format, with the most widespread alternative being the Analyze 7.5 format. This format differs from the standard Analyze 7.5 format used to store magnetic resonance imaging data. The difference is small, but large enough that non-MSI Analyze 7.5 readers will not work on MSI data.
In the MSI field, the Analyze 7.5 format consists of three files: an img file that stores binary spectral data, an hdr file that stores the number of pixels in the x and y dimensions and a t2m file that stores the m/z values for each spectrum. The imzML format is a hybrid between the Analyze 7.5 format and the other XML-based formats commonly used for mass spectrometry. As with the Analyze 7.5 img file, a 'blob' of spectral data is stored as a separate binary file (the ibd file). This facilitates rapid slicing of spectra from the file by jumping to a specific location in the file and reading one or more spectra. As with the widely used XML files found in other mass spectrometry applications, a wide variety of information can be captured for each spectrum in a human-readable, text-based XML format (the imzML file). imzML uses a standard vocabulary shared with the mzML format, with additional terms that define the x, y and z position as well as the sample orientation.
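The benefit of keeping spectra in a separate binary blob can be illustrated with a small sketch. It is not an imzML or Analyze 7.5 reader: the in-memory stream stands in for an ibd or img file, the list `index` stands in for the per-spectrum offsets that the text-based part of a real file would provide, and the helper names and the choice of little-endian 32-bit floats are assumptions made for illustration.

```python
import io
import struct

def write_spectra(stream, spectra):
    """Append float32 intensity arrays to a binary stream.

    Returns one (byte offset, number of values) pair per spectrum.
    """
    index = []
    for intensities in spectra:
        offset = stream.tell()
        stream.write(struct.pack(f"<{len(intensities)}f", *intensities))
        index.append((offset, len(intensities)))
    return index

def read_spectrum(stream, offset, length):
    """Jump straight to one spectrum and decode it without reading the rest."""
    stream.seek(offset)
    raw = stream.read(4 * length)  # 4 bytes per float32 value
    return list(struct.unpack(f"<{length}f", raw))

# Three made-up spectra of different lengths.
spectra = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
blob = io.BytesIO()
index = write_spectra(blob, spectra)

# Retrieve only the second spectrum by seeking to its offset.
second = read_spectrum(blob, *index[1])
```

Because each read touches only the bytes of the requested spectrum, access time is largely independent of the total file size, which is the property the separate binary file is meant to provide.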


4. DATA ANALYSIS

MSI analysis can be broadly divided into two categories depending on the nature of the research: targeted and untargeted analysis. In targeted analyses, a small number of molecules are investigated. Typically, it is desirable to measure the levels of the targeted molecules and to be certain about their identities. In comparison, untargeted analyses often focus on the observation of a broad range of molecules. The levels of molecules in an untargeted experiment are typically presented in a relative manner (as opposed to absolute), and identities are typically far less certain. Although there is a wide variety of approaches to analysing MSI datasets, the most widely used approaches are described in context below.

4.1 Preprocessing (Data Cleaning: Noise Reduction, m/z Calibration)

Independent of the specific downstream analysis methods, MSI data analysis starts with a set of steps collectively known as preprocessing, including baseline subtraction, noise correction and spectrum smoothing. Preprocessing moves raw MSI data into a refined state that is much more amenable to interpretation and analysis. Raw MSI spectra may contain different sources of background/noise, including background generated during the MS process (e.g. in MALDI, chemical noise from matrix ions) and position-dependent background ion signals. This matters because background/noise complicates interpretation and may be mistaken for relevant peaks in the sample. Peak picking, peak detection and centroiding all refer to preprocessing methods that reduce background/noise by selecting the m/z values corresponding to relevant peaks in the sample (Fig. 2) [23].

Fig. 2 Peak finding. Peak finding (also known as centroiding) is a widely used technique in mass spectrometry where raw, profile-mode spectra (A) are transformed into peak centroids (B). A peak cube can be created from these m/z centroids, collectively stored as a stack of images (one for each m/z centroid). The MSI dataset used can be found in OpenMSI [22].

In addition to background/noise, MSI spectra may contain baseline noise (dark counts) that must be removed to enable effective feature extraction and selection. The baseline of a spectrum is defined by the point of zero intensity and should mirror the signal that would be observed if no sample were present. Most MSI spectra lack a clear baseline, and a wide suite of algorithms exists for baseline correction and baseline subtraction [24–26]. One rank-preserving scaling approach for baseline subtraction is the sorted mass spectrum transform, which sorts and scales intensity values, cuts off values below specified criteria and rescales the values back to a meaningful intensity level [27]. The majority of commercial data acquisition software either applies baseline subtraction automatically or offers one of a limited number of options [25,26,28]. An open-source approach that combines peak picking and baseline subtraction is the Top Hat algorithm, but this is generally used for low-resolution data where peaks are not clearly identifiable [29,30].

Data calibration is another very important issue. Because acquisition of MSI data often requires many hours of continuous spectral acquisition, and samples lack uniformity both spatially and chemically, the mass calibration can shift during acquisition despite prior calibration of the mass spectrometer [31]. The mass accuracy of an MSI dataset can be improved through spectral recalibration, and several recalibration approaches have been reported, including methods for mass error correction that do not depend on the presence of an internal or external calibrant [31–33].

These processing steps are used to generate a peak cube, which is the collection of ion images for the centroid m/z values. The peak cube has dimensions of x, y and m/z. Most MSI datasets have hundreds of x,y coordinates and a few thousand peak centroids. The peak cube can be sliced in m/z to obtain ion images, and for a given x,y coordinate, a spectrum can be obtained. Most notably, the transformation of raw data to a peak cube lays the foundation for further processing.
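As a rough illustration of two of the preprocessing steps above, the sketch below estimates a baseline with a top-hat style morphological opening, subtracts it, and then picks local maxima as centroids. It is a simplified stand-in under stated assumptions, not the implementation used by any cited package: the window half-width, the height threshold and the synthetic spectrum (a sloping baseline with two Gaussian peaks) are all made up.

```python
import numpy as np

def tophat_baseline(intensities, half_width):
    """Estimate a baseline by morphological opening (erosion, then dilation)."""
    y = np.asarray(intensities, dtype=float)
    n = len(y)
    # Erosion: running minimum over a window of 2 * half_width + 1 bins.
    eroded = np.array([y[max(0, i - half_width): i + half_width + 1].min()
                       for i in range(n)])
    # Dilation of the eroded signal: running maximum over the same window.
    return np.array([eroded[max(0, i - half_width): i + half_width + 1].max()
                     for i in range(n)])

def pick_peaks(mz, intensities, min_height):
    """Return (m/z, intensity) of local maxima at or above min_height."""
    y = np.asarray(intensities, dtype=float)
    is_peak = (y[1:-1] > y[:-2]) & (y[1:-1] >= y[2:]) & (y[1:-1] >= min_height)
    idx = np.flatnonzero(is_peak) + 1
    return mz[idx], y[idx]

# Synthetic profile spectrum: sloping baseline plus two narrow peaks.
mz = np.linspace(100.0, 110.0, 201)
baseline_true = 5.0 + 0.2 * (mz - 100.0)
peaks_true = (50.0 * np.exp(-0.5 * ((mz - 103.0) / 0.05) ** 2)
              + 30.0 * np.exp(-0.5 * ((mz - 107.0) / 0.05) ** 2))
raw = baseline_true + peaks_true

corrected = raw - tophat_baseline(raw, half_width=10)
centroid_mz, centroid_int = pick_peaks(mz, corrected, min_height=10.0)
```

The window must be wider than the widest real peak, otherwise the opening follows the peak and removes it along with the baseline. Applying the same two functions to every spectrum and collecting the centroid intensities on a common m/z list would give the peak cube described above.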

4.2 Normalization and Quantification in MSI

Normalization of MSI datasets is often required to minimize spectrum-to-spectrum differences in peak intensity so that spectra can be compared. The goal of normalization is to make the MSI dataset independent of variation between individual spectra. Normalization with respect to an internal standard of known concentration would be ideal, as it enables accurate calculation of unknown ion abundances on a spectrum-by-spectrum basis. Typically, this is not done during initial imaging experiments, and normalization methods are instead based on rescaling relative to some property that can be measured from the spectra [23]. The most commonly applied normalization procedure in mass spectrometry is normalization to the total ion count (TIC), where all mass spectra are divided by their TIC so that all spectra in a dataset have the same integrated area under the spectrum. Other normalization strategies applied in MSI are median and root mean square normalization [27,34]. As an example, Fig. 3 shows a raw MSI dataset that was normalized using three different methods: TIC normalization; TIC normalization followed by contrast enhancement between the 1st and 99th percentiles of intensity; and TIC normalization followed by contrast enhancement between the 5th and 95th percentiles of intensity.

Fig. 3 Normalization. A red, green and blue image using three spatially distinct ions is used to illustrate normalization approaches that transform the peak cube (A) into representations of the data that better reflect the nature of the sample (B–D). (B) The ions divided by the sum intensity at each position, with each centroid-m/z image in the cube then scaled by the pixel with the highest intensity. (C) The ions divided by the sum intensity at each position and then rescaled between the 1st and 99th percentile of intensities. (D) The ions rescaled between the 5th and 95th percentile of intensities. The MSI dataset used can be found in OpenMSI [22].

Normalization is especially important for relative and absolute quantification in MSI. The simplest form of normalization is to calculate ratios of ions corresponding to metabolites with similar desorption/ionization characteristics (e.g. phospholipids). However, it is often desirable to make use of analytical standards when these are available. In relative quantification, a fixed amount of an internal standard is added to the sample, and the peak heights or areas of individual endogenous analytes are measured relative to it. This only works if the samples are run under identical conditions, at the same time, and all contain the same amount of internal standard. For absolute quantification, in addition to an internal standard, a calibration curve has to be prepared from the same (fixed) amount of the internal standard and varying amounts of a single specific analyte of interest [35]. Using the standard curve, the constant of proportionality for one analyte can be established, and ion abundance (or intensity) ratios can be converted to absolute amounts [35]. Several quantitative MSI strategies have been applied to quantify peptides, proteins, and pharmaceutical drugs and their metabolites in tissue specimens. Examples include the quantitation of neurotransmitters, precursors and metabolites in brain tissue sections and the absolute quantification of rifampicin in liver tissue [36,37].
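TIC normalization and percentile-based contrast enhancement, as used for Fig. 3, can be sketched as follows. This is a minimal example assuming a peak cube of shape (y, x, m/z) held as a NumPy array; the function names and the random test data are hypothetical, and the handling of empty pixels (left at zero) is a design choice of this sketch rather than something specified in the text.

```python
import numpy as np

def tic_normalize(peak_cube):
    """Divide each pixel's spectrum by its total ion count (sum over m/z)."""
    tic = peak_cube.sum(axis=-1, keepdims=True)
    safe_tic = np.where(tic > 0, tic, 1.0)  # empty pixels stay at zero
    return peak_cube / safe_tic

def percentile_rescale(image, low=1.0, high=99.0):
    """Clip an ion image to the [low, high] percentiles and rescale to 0..1."""
    lo, hi = np.percentile(image, [low, high])
    if hi <= lo:
        return np.zeros_like(image, dtype=float)
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)

# Made-up peak cube: 4 x 5 pixels, 6 centroids, with pixel-to-pixel
# variation in overall signal that TIC normalization should remove.
rng = np.random.default_rng(0)
peak_cube = rng.random((4, 5, 6)) * rng.integers(1, 100, size=(4, 5, 1))

normalized = tic_normalize(peak_cube)
display = percentile_rescale(normalized[:, :, 0], low=5.0, high=95.0)
```

After `tic_normalize`, every non-empty spectrum sums to one, so the remaining differences between pixels reflect relative rather than absolute ion abundance. Narrower percentile limits (5th and 95th instead of 1st and 99th) increase contrast at the cost of saturating more pixels.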

4.3 Untargeted Analysis: Feature Detection, Multivariate Statistics

Finding meaningful signals in the sometimes overwhelming amount of data acquired with MSI is greatly aided by algorithms that identify trends and prioritize the chemical features that best represent particular trends. Analysis guided only by visual interpretation of MSI spectra and images will have difficulty finding important, low-intensity features. Since the dynamic range in intensity of MSI data can be as much as four orders of magnitude, only the most intense 10% of signals can be seen by eye. Likewise, many metabolites have masses similar enough that they are not resolved by lower resolution mass spectrometers. Peak detection algorithms turn raw, continuous spectral profiles into centroided features. When broadcast across the entire dataset, the relative intensity of each centroided feature comprises an image. By collecting all the images for a dataset, one can assemble the values necessary for statistical analysis.

Untargeted analyses typically use clustering, classification and factorization algorithms. These approaches find associations among the ions, the pixels or both simultaneously. Broadly speaking, clustering is used to guide the combining of pixels with similar spectra and of ions with similar spatial distributions; classification is used to recognize spectral patterns that distinguish different regions of an image; and factorization is used to represent spatial distributions, spectra or both using high-value features in a lower dimensionality than the original dataset.

Classification in MSI is the assignment of regions of an image to a predefined class based on the properties of the spectra for that region. Classes could be cell type, treatment group, population, etc. Typically, supervised approaches are used to identify classes in regions based on their spectral characteristics. First, a training step is used to learn the spectral characteristics associated with regions labelled with a specific class; second, the algorithm is used to identify regions as belonging to a given class [38]. Partial least squares-discriminant analysis (PLS-DA) is often the first algorithm considered for these approaches by the chemoinformatics community [39]. Random forest classifiers are also used for supervised classification [40].
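The two-step structure of supervised classification (train on labelled spectra, then assign classes to new spectra) can be shown with a deliberately simple classifier. The sketch below uses a nearest-centroid rule as a stand-in; it is not PLS-DA or a random forest, and the class names and three-ion spectra are invented for illustration.

```python
import numpy as np

def train_centroids(spectra, labels):
    """Training step: average the spectra of each labelled class."""
    classes = sorted(set(labels))
    labels = np.asarray(labels)
    centroids = np.vstack([spectra[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def classify(spectra, classes, centroids):
    """Prediction step: assign each spectrum to the nearest class centroid."""
    distances = np.linalg.norm(
        spectra[:, None, :] - centroids[None, :, :], axis=2
    )
    return [classes[i] for i in distances.argmin(axis=1)]

# Made-up training spectra (rows) over three centroided ions (columns).
train = np.array([
    [9.0, 1.0, 0.0],
    [8.0, 2.0, 0.0],
    [0.0, 1.0, 9.0],
    [1.0, 2.0, 8.0],
])
train_labels = ["tumour", "tumour", "stroma", "stroma"]  # hypothetical classes

classes, centroids = train_centroids(train, train_labels)
unlabelled = np.array([[7.0, 1.0, 1.0], [0.5, 1.0, 7.5]])
predicted = classify(unlabelled, classes, centroids)
```

In practice the spectra would usually be normalized first, and performance should be judged on pixels that were held out from training.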
All supervised machine-learning approaches can suffer from pitfalls related to overfitting and to zeroing in on unimportant signal characteristics. Nevertheless, given sufficient labelled training data and a problem with a high signal-to-noise solution, these approaches have been shown to work well. PLS-DA is implemented in the Cardinal R package for MSI for datasets that fit in memory [16].

Shown in Fig. 4A are 12 panels generated by application of k-means clustering to a centroided MSI dataset of a flaxseed seedpod [10,41]. Each pixel is assigned to 1 of 12 clusters. In segmentation or clustering, the MSI dataset is represented in a single image where regions of distinct molecular composition are grouped, usually by colour coding. Here, pixels with more similar spectra are assigned to the same cluster. Immediately obvious in this example are pixels assigned to different regions of the seedpod and pixels assigned to background locations. Since a colour is assigned to a cluster, and not to a distinct region, the segmentation map can have several spatially disconnected regions of the same colour. Clustering approaches provide a relatively efficient means of subdividing a dataset. A good approach is first to assign pixels as either background or sample, and then to remove the background pixels and proceed with a rigorous analysis of the sample pixels. Several advanced segmentation methods have been developed, including the above-mentioned k-means clustering, hierarchical clustering, clustering with edge-preserving image denoising and efficient spatially aware clustering [27,42].

Fig. 4 Clustering and factorization. (A) k-means clustering can efficiently separate pixels according to their spectral differences. Twelve different spatial clusters were identified from the peak cube intensities. (B) Factorization techniques, such as nonnegative matrix factorization (NMF) shown here, compute a lower dimensional representation of the peak cube. Here, three components are shown. For each component, an image (B1–B3) and a spectrum (B4–B6) can be seen from their coefficients. B1 and B4 are associated with the same NMF component, B2 and B5 with the same component, and B3 and B6 with the same component. Used in this way, NMF can often reduce the dimensionality of the peak cube from 1000s of m/z centroids to 10–20 components, where general trends can be found quickly.

Factorization is often applied in biological studies to cluster or classify observables; however, the nature of factorization algorithms differs strongly from that of clustering algorithms. Matrix factorization involves the decomposition of one matrix into the product of two new matrices. For MSI, factorization of a matrix with n locations and m ions into 20 components will yield two matrices: one of size [n × 20] and one of size [20 × m]. In Fig. 4B, the panels show both a spectral matrix and a positional matrix for three components factored from the original dataset with nonnegative matrix factorization (NMF). Care must be taken when interpreting components individually, but generally this approach can be used to understand simultaneously how spectral and spatial variation covary in a dataset. Factorization approaches used in MSI include NMF and maximum autocorrelation factorization [42,43].
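The background-removal strategy and the matrix shapes discussed above can be sketched together. The example below runs a plain Lloyd's k-means with two clusters to separate background from sample pixels, then applies NMF with multiplicative updates to the sample pixels only. Both routines are simplified textbook versions written for illustration; the random initialization, iteration counts and the synthetic 6 × 6 pixel cube are assumptions of this sketch, and none of it reproduces the analysis behind Fig. 4.

```python
import numpy as np

def kmeans(pixels, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on an (n_pixels, n_ions) matrix; returns labels."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels

def nmf(V, n_components, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: V (n x m) is approximated by W (n x r) @ H (r x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, n_components)) + eps
    H = rng.random((n_components, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic peak cube: weak background everywhere, strong signal in a
# 4 x 4 "sample" region (all values are made up).
rng = np.random.default_rng(1)
ny, nx, n_ions = 6, 6, 8
cube = rng.random((ny, nx, n_ions)) * 0.01
cube[1:5, 1:5, :] += np.linspace(1.0, 2.0, n_ions)

pixels = cube.reshape(-1, n_ions)          # n locations x m ions
labels = kmeans(pixels, k=2)
segmentation = labels.reshape(ny, nx)      # one cluster label per pixel

# Keep only the cluster that contains a known sample pixel, then factorize.
sample_label = segmentation[2, 2]
sample_pixels = pixels[labels == sample_label]
W, H = nmf(sample_pixels, n_components=3)  # W: positions, H: spectra
```

Each column of `W` can be mapped back onto the sample pixels to give a component image, and the matching row of `H` is the component spectrum, mirroring the image and spectrum pairs in Fig. 4B.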

4.4 Compound Identification

Typical MSI datasets differ from liquid chromatography-mass spectrometry (LC-MS/MSMS) datasets in that they often lack fragmentation spectra for measured features [20]. There are examples where MSMS data are acquired for MSI datasets, but this is far less common than in LC-MS/MSMS, where fragmentation spectra are abundant and routinely collected [37]. Since MS1 is the primary measurement in MSI, no structural information is obtained about the molecule. Consequently, isotope ratios and database searching are the primary means of guessing which molecules comprise a given feature. A new approach has recently emerged that estimates the false-discovery rate for compound identification based on MS1 data only and provides a score to judge the correctness of a result [20].
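Database searching on MS1 data alone amounts to matching a measured m/z against theoretical values within a mass tolerance. The sketch below shows the idea with a tolerance in parts per million; the three database entries are placeholder names with invented m/z values, not real compounds, and the function names are hypothetical.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error of a measurement in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def search_database(measured_mz, database, tol_ppm=5.0):
    """Return (name, ppm error) for every entry within tol_ppm, best first."""
    hits = [
        (name, ppm_error(measured_mz, mz))
        for name, mz in database.items()
        if abs(ppm_error(measured_mz, mz)) <= tol_ppm
    ]
    return sorted(hits, key=lambda hit: abs(hit[1]))

# Placeholder entries with made-up theoretical m/z values.
database = {
    "compound_A": 300.0000,
    "compound_B": 300.0010,
    "compound_C": 300.0100,
}
hits = search_database(300.0004, database, tol_ppm=5.0)
```

Here two of the three candidates fall within 5 ppm of the measured value, which illustrates why an m/z match alone is only a guess and why a false-discovery-rate estimate is useful.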


4.5 Stable Isotope Labelling: Isotopic Enrichment Calculations

MS imaging of tissues enables the spatial mapping of molecular composition; however, the resulting images are a static snapshot in time of molecules involved in highly dynamic processes. Incorporating stable isotopes into tissues, and thus into their biosynthetic pathways, provides a unique opportunity to study molecular kinetics in biological systems. Stable isotope labelling in MSI has been applied to study dynamic molecular changes of amino acids within biological tissues by measuring the dilution and conversion of 13C6-labelled L-phenylalanine in a mouse model, and it has revealed heterogeneous spatial distributions of newly synthesized vs preexisting lipids within a tumour using in vivo metabolic labelling of tissue with deuterium [44,45]. For the analysis of the kinetics, specialty algorithms were developed to determine isotopic enrichment in MSI datasets [44,45].

Isotope enrichment is determined by integrating the measured intensity of individual isotopologues and subtracting the intensity expected from natural isotope incorporation. This approach is widely applied in LC-MS, where signals from individual molecules are well separated. In contrast, in MSI molecular signals often overlap and must be deconvoluted. First, the isotopic intensities due to natural isotope incorporation are estimated or measured. These intensities form a pattern, which is weighted by a coefficient to give the unenriched, or naturally occurring, intensity of a molecule. If the amount of isotopic enrichment is known, then an enriched isotopic pattern can be specified for each molecule. The coefficients for the enriched and unenriched patterns can be calculated using an optimization approach such as nonnegative least squares. This gives the amount of preexisting and newly synthesized signal for each molecule [44–46].
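The deconvolution step can be written as a small nonnegative least squares problem: the measured isotopologue intensities are modelled as a nonnegative combination of an unenriched and an enriched pattern. The sketch below is an illustration under stated assumptions, not the cited algorithms. Both isotopic patterns and the measured intensities are invented, and a short projected-gradient loop is used as a dependency-free stand-in for a dedicated solver such as SciPy's `scipy.optimize.nnls`.

```python
import numpy as np

def nnls_projected_gradient(A, b, n_iter=500):
    """Minimise ||A x - b||^2 subject to x >= 0 by projected gradient descent."""
    step = 1.0 / np.linalg.norm(A.T @ A, 2)  # 1 / largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient = A.T @ (A @ x - b)
        x = np.maximum(0.0, x - step * gradient)  # project onto x >= 0
    return x

# Made-up relative intensities for isotopologues M+0 .. M+4.
natural_pattern = np.array([0.80, 0.15, 0.05, 0.00, 0.00])   # unenriched
enriched_pattern = np.array([0.10, 0.20, 0.30, 0.25, 0.15])  # labelled

# Made-up measurement for one pixel: 70 units preexisting, 30 units new.
measured = 70.0 * natural_pattern + 30.0 * enriched_pattern

A = np.column_stack([natural_pattern, enriched_pattern])
preexisting, newly_synthesized = nnls_projected_gradient(A, measured)
fraction_new = newly_synthesized / (preexisting + newly_synthesized)
```

Repeating the fit at every pixel would give one image of preexisting signal and one of newly synthesized signal for the molecule. With noisy data the nonnegativity constraint prevents physically meaningless negative amounts.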

4.6 Analyzing Spatial Defined Samples in MSI Datasets

Besides analyzing the spatial distribution of biomolecules within tissues, MSI can also be directly applied to measure and compare thousands of samples intentionally spotted or stamped in a spatially defined manner [47,48]. This increases throughput, which is often highly desirable because it decreases costs and enables large-scale studies [49]. Numerous MSI platforms capable of analyzing spatially defined samples in a high-throughput fashion have emerged and can be applied to high-throughput enzyme activity screening, the characterization of peptide microarrays or the screening of compound libraries [49–52]. Although standard MSI software tools can be used, analyzing spatially defined samples has its own unique challenges.


To address these challenges, de Raad et al. developed a computationally efficient and easy-to-use algorithm, the OpenMSI Arrayed Analysis Toolkit (OMAAT), that enables application scientists to quickly and reliably analyse thousands of spatially defined samples via MSI [52]. Because it uses a web-based Python notebook (Jupyter), OMAAT is accessible to anyone without programming experience yet allows experienced users to leverage all features, and it runs on all major operating systems. The source code can be obtained from the following GitHub repository: https://github.com/biorack/omaat. OMAAT was used to analyse an MSI dataset of a high-throughput glycoside hydrolase activity screen comprising 384 samples and to screen the metabolic activities of different sized soil particles in a 1536-sample screen [52].
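The core operation in analysing arrayed samples is to integrate an ion's intensity over the pixels belonging to each spot. The sketch below shows that operation in generic form; it does not use or reproduce the OMAAT code or its interface. The spot centres, the disc radius and the synthetic ion image are assumptions made for illustration.

```python
import numpy as np

def spot_intensities(ion_image, centers, radius):
    """Sum ion intensity inside a disc around each (row, col) spot centre."""
    rows, cols = np.indices(ion_image.shape)
    totals = []
    for r0, c0 in centers:
        mask = (rows - r0) ** 2 + (cols - c0) ** 2 <= radius ** 2
        totals.append(float(ion_image[mask].sum()))
    return totals

# Synthetic ion image with a 2 x 3 array of spots (values are made up).
image = np.zeros((20, 30))
centers = [(5, 5), (5, 15), (5, 25), (15, 5), (15, 15), (15, 25)]
for i, (r, c) in enumerate(centers):
    image[r - 1 : r + 2, c - 1 : c + 2] = i + 1  # 3 x 3 pixels per spot

totals = spot_intensities(image, centers, radius=2)
```

For an enzyme activity screen, the same integration could be run on the ion images of substrate and product, and the ratio of the two totals compared across spots. In real data the spot grid rarely aligns perfectly with the raster, so the centres usually need to be fitted or adjusted before integration.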

5. FUTURE DIRECTIONS

There are many exciting new areas of research that have the potential to greatly impact the future of MSI. As a relatively new technique, MSI is uniquely suited to address many areas of investigation. Proteomic and metabolomic measurements yield the average levels from homogenized samples and are blind to the spatial heterogeneity and subpopulations present in a sample. In comparison, MSI can directly address the 'needle-in-a-haystack' problem, in which a small position in the sample has completely different peptides or metabolites from the bulk of the sample. With traditional grind-and-find approaches, a small subpopulation like this would likely be overwhelmed by the majority signals in the sample.

The MSI community is embracing important standard practices that will likely have a great impact on the future of the field. There are now numerous open-access and open-source software packages available for the analysis of MSI data. Only a few years ago, simply opening an MSI file and retrieving images and spectra required the assistance of someone with computer programming expertise. Another major standard that is seeing widespread adoption is the imzML file format. This makes it easier to develop software, which no longer needs to support an endless diversity of input file formats. Although not yet widely adopted, the practice of accompanying MSI publications with raw data and following reporting standards is noticeably increasing. Along the same lines, there has been some work towards standardizing the reporting of, and the evidence for, compounds identified in an MSI experiment.

As MSI acquisition technologies change, the analysis tools required will necessarily change. One of the main bottlenecks in MSI is the long acquisition time required to obtain a high-resolution MSI file. This can easily be tens of hours for a single image. An exciting area that aims to shorten this drastically is 'whole image cameras'. These devices could potentially shorten acquisition times by multiple orders of magnitude. Likewise, new desorption and ionization techniques are yielding spectra that contain a greater chemical diversity than previously possible. As these technologies mature, we will see MSI datasets that can be acquired quickly and that contain a comprehensive view of spatially resolved metabolism.

ACKNOWLEDGEMENTS

This work was supported by the U.S. Department of Energy Office of Science by the Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) Program under Contract No. DE-AC02-05CH11231.

REFERENCES

[1] J.D. Watrous, P.C. Dorrestein, Imaging mass spectrometry in microbiology, Nat. Rev. Microbiol. 9 (2011) 683–694, https://doi.org/10.1038/nrmicro2634.
[2] L.P. Silva, T.R. Northen, Exometabolomics and MSI: deconstructing how cells interact to transform their small molecule environment, Curr. Opin. Biotechnol. 34 (2015) 209–216, https://doi.org/10.1016/j.copbio.2015.03.015.
[3] P. Nemes, A. Vertes, Laser ablation electrospray ionization for atmospheric pressure, in vivo, and imaging mass spectrometry, Anal. Chem. 79 (2007) 8098–8106, https://doi.org/10.1021/ac071181r.
[4] T.R. Northen, O. Yanes, M.T. Northen, D. Marrinucci, W. Uritboonthai, J. Apon, S.L. Golledge, A. Nordström, G. Siuzdak, Clathrate nanostructures for mass spectrometry, Nature 449 (2007) 1033–1036, https://doi.org/10.1038/nature06195.
[5] D. Miura, Y. Fujimura, H. Tachibana, H. Wariishi, Highly sensitive matrix-assisted laser desorption ionization-mass spectrometry for high-throughput metabolic profiling, Anal. Chem. 82 (2010) 498–504, https://doi.org/10.1021/ac901083a.
[6] J.A. Stolee, B.N. Walker, V. Zorba, R.E. Russo, A. Vertes, Laser–nanostructure interactions for ion production, Phys. Chem. Chem. Phys. 14 (2012) 8453, https://doi.org/10.1039/c2cp00038e.
[7] D.G. Beach, C.M. Walsh, P. McCarron, High-throughput quantitative analysis of domoic acid directly from mussel tissue using laser ablation electrospray ionization—tandem mass spectrometry, Toxicon 92 (2014) 75–80, https://doi.org/10.1016/j.toxicon.2014.10.009.
[8] L.-P. Li, B.-S. Feng, J.-W. Yang, C.-L. Chang, Y. Bai, H.-W. Liu, Applications of ambient mass spectrometry in high-throughput screening, Analyst 138 (2013) 3097–3103, https://doi.org/10.1039/c3an00119a.
[9] F. Suits, T.E. Fehniger, Á. Végvári, G. Marko-Varga, P. Horvatovich, Correlation queries for mass spectrometry imaging, Anal. Chem. 85 (2013) 4398–4404, https://doi.org/10.1021/ac303658t.
[10] O. Rübel, A. Greiner, S. Cholia, K. Louie, E.W. Bethel, T.R. Northen, B.P. Bowen, OpenMSI: a high-performance web-based platform for mass spectrometry imaging, Anal. Chem. 85 (2013) 10354–10361, https://doi.org/10.1021/ac402540a.
[11] M.-F. Robbe, J.-P. Both, B. Prideaux, I. Klinkert, V. Picaud, T. Schramm, A. Hester, V. Guevara, M. Stoeckli, A. Roempp, R.M.A. Heeren, B. Spengler, O. Gala, S. Haan, Software tools of the Computis European project to process mass spectrometry images, Eur. J. Mass Spectrom. (Chichester, Eng.) 20 (2014) 351–360, http://www.ncbi.nlm.nih.gov/pubmed/25707124. Accessed 19 December 2017.
[12] I. Klinkert, K. Chughtai, S.R. Ellis, R.M.A. Heeren, Methods for full resolution data exploration and visualization for large 2D and 3D mass spectrometry imaging datasets, Int. J. Mass Spectrom. 362 (2014) 40–47, https://doi.org/10.1016/J.IJMS.2013.12.012.
[13] G. Robichaud, K.P. Garrard, J.A. Barry, D.C. Muddiman, MSiReader: an open-source interface to view and analyze high resolving power MS imaging files on matlab platform, J. Am. Soc. Mass Spectrom. 24 (2013) 718–721, https://doi.org/10.1007/s13361-013-0607-z.
[14] M.T. Bokhart, M. Nazari, K.P. Garrard, D.C. Muddiman, MSiReader v1.0: evolving open-source mass spectrometry imaging software for targeted and untargeted analyses, J. Am. Soc. Mass Spectrom. 29 (2018) 8–16, https://doi.org/10.1007/s13361-017-1809-6.
[15] R.M. Parry, A.S. Galhena, C.M. Gamage, R.V. Bennett, M.D. Wang, F.M. Fernández, OmniSpect: an open MATLAB-based tool for visualization and analysis of matrix-assisted laser desorption/ionization and desorption electrospray ionization mass spectrometry images, J. Am. Soc. Mass Spectrom. 24 (2013) 646–649, https://doi.org/10.1007/s13361-012-0572-y.
[16] K.D. Bemis, A. Harry, L.S. Eberlin, C. Ferreira, S.M. van de Ven, P. Mallick, M. Stolowitz, O. Vitek, Cardinal: an R package for statistical analysis of mass spectrometry-based imaging experiments, Bioinformatics 31 (2015) 2418–2420, https://doi.org/10.1093/bioinformatics/btv146.
[17] P. Källback, A. Nilsson, M. Shariatgorji, P.E. Andren, msIQuant—quantitation software for mass spectrometry imaging enabling fast access, visualization, and analysis of large data sets, Anal. Chem. 88 (2016) 4346–4353, https://doi.org/10.1021/acs.analchem.5b04603.
[18] A.M. Race, A.D. Palmer, A. Dexter, R.T. Steven, I.B. Styles, J. Bunch, Spectral analysis: software for the masses, Anal.
Chem. 88 (2016) 9451–9458, https://doi.org/ 10.1021/acs.analchem.6b01643. A. R€ ompp, R. Wang, J.P. Albar, A. Urbani, H. Hermjakob, B. Spengler, J.A. Vizcaı´no, A public repository for mass spectrometry imaging data, Anal. Bioanal. Chem. 407 (2015) 2027–2033, https://doi.org/10.1007/s00216-014-8357-8. A. Palmer, P. Phapale, I. Chernyavsky, R. Lavigne, D. Fay, A. Tarasov, V. Kovalev, J. Fuchser, S. Nikolenko, C. Pineau, M. Becker, T. Alexandrov, FDR-controlled metabolite annotation for high-resolution imaging mass spectrometry, Nat. Methods 14 (2016) 57–60, https://doi.org/10.1038/nmeth.4072. ockli, A. R€ ompp, T. Schramm, A. Hester, I. Klinkert, J.-P. Both, R.M.A. Heeren, M. St€ B. Spengler, imzML: imaging mass spectrometry markup language: a common data format for mass spectrometry imaging, Methods Mol. Biol. (2011) 205–224, https://doi. org/10.1007/978-1-60761-987-1_12. OpenMSI—Flax_Pod_12_day_old_CS, (n.d.). https://openmsi.nersc.gov/openmsi/ client/viewer?cursorCol2¼194&cursorCol1¼97&rangeValue¼0.0999988878038& cursorRow1¼101&cursorRow2¼202&image_name¼Flax_Pod_12_day_old_CS.h5 &channel3Value¼639.9939&enableClientCache¼true&channel2Value¼459.9959& dataIndex¼0&fil (accessed March 21, 2018). C.S. Sun, M.K. Markey, Recent advances in computational analysis of mass spectrometry for proteomic profiling, J. Mass Spectrom. 46 (2011) 443–456, https://doi.org/ 10.1002/jms.1909. H. Shin, M.P. Sampat, J.M. Koomen, M.K. Markey, Wavelet-based adaptive denoising and baseline correction for MALDI TOF MS, OMICS 14 (2010) 283–295, https://doi. org/10.1089/omi.2009.0119.

[25] C. Rowlands, S. Elliott, Automated algorithm for baseline subtraction in spectra, J. Raman Spectrosc. 42 (2011) 363–369, https://doi.org/10.1002/jrs.2691.
[26] J.L. Norris, D.S. Cornett, J.A. Mobley, M. Andersson, E.H. Seeley, P. Chaurand, R.M. Caprioli, Processing MALDI mass spectra to improve mass spectral direct tissue analysis, Int. J. Mass Spectrom. 260 (2007) 212–221, https://doi.org/10.1016/j.ijms.2006.10.005.
[27] T. Alexandrov, MALDI imaging mass spectrometry: statistical data analysis and current computational challenges, BMC Bioinformatics 13 (Suppl. 16) (2012) S11, https://doi.org/10.1186/1471-2105-13-S16-S11.
[28] J. Hanrieder, A. Ljungdahl, M. Andersson, MALDI imaging mass spectrometry of neuropeptides in Parkinson's disease, J. Vis. Exp. 60 (2012) pii: 3445, https://doi.org/10.3791/3445.
[29] E. Lange, C. Gröpl, K. Reinert, O. Kohlbacher, A. Hildebrandt, High-accuracy peak picking of proteomics data using wavelet techniques, in: Proceedings of the 11th Pacific Symposium on Biocomputing, 2006, World Scientific, 2005, pp. 243–254, https://doi.org/10.1142/9789812701626_0023.
[30] S.-O. Deininger, D.S. Cornett, R. Paape, M. Becker, C. Pineau, S. Rauser, A. Walch, E. Wolski, Normalization in MALDI-TOF imaging datasets of proteins: practical considerations, Anal. Bioanal. Chem. 401 (2011) 167–181, https://doi.org/10.1007/s00216-011-4929-z.
[31] P. Kulkarni, F. Kaftan, P. Kynast, A. Svatoš, S. Böcker, Correcting mass shifts: a lock mass-free recalibration procedure for mass spectrometry imaging data, Anal. Bioanal. Chem. 407 (2015) 7603–7613, https://doi.org/10.1007/s00216-015-8935-4.
[32] J.A. Barry, G. Robichaud, D.C. Muddiman, Mass recalibration of FT-ICR mass spectrometry imaging data using the average frequency shift of ambient ions, J. Am. Soc. Mass Spectrom. 24 (2013) 1137–1145, https://doi.org/10.1007/s13361-013-0659-0.
[33] A.N. Kozhinov, K.O. Zhurov, Y.O. Tsybin, Iterative method for mass spectra recalibration via empirical estimation of the mass calibration function for Fourier transform mass spectrometry-based petroleomics, Anal. Chem. 85 (2013) 6437–6445, https://doi.org/10.1021/ac400972y.
[34] P. Källback, M. Shariatgorji, A. Nilsson, Novel mass spectrometry imaging software assisting labeled normalization and quantitation of drugs and neuropeptides directly in tissue sections, J. Proteomics 75 (2012) 4941–4951, https://doi.org/10.1016/J.JPROT.2012.07.034.
[35] M.W. Duncan, H. Roder, S.W. Hunsucker, Quantitative matrix-assisted laser desorption/ionization mass spectrometry, Brief. Funct. Genomic. Proteomic. 7 (2008) 355–370, https://doi.org/10.1093/bfgp/eln041.
[36] M. Shariatgorji, A. Nilsson, R.J.A. Goodwin, P. Källback, N. Schintu, X. Zhang, A.R. Crossman, E. Bezard, P. Svenningsson, P.E. Andren, Direct targeted quantitative molecular imaging of neurotransmitters in brain tissue sections, Neuron 84 (2014) 697–707, https://doi.org/10.1016/j.neuron.2014.10.011.
[37] B.M. Prentice, C.W. Chumbley, R.M. Caprioli, Absolute quantification of rifampicin by MALDI imaging mass spectrometry using multiple TOF/TOF events in a single laser shot, J. Am. Soc. Mass Spectrom. 28 (2017) 136–144, https://doi.org/10.1007/s13361-016-1501-2.
[38] M. Pietrowska, H.C. Diehl, G. Mrukwa, M. Kalinowska-Herok, M. Gawin, M. Chekan, J. Elm, G. Drazek, A. Krawczyk, D. Lange, H.E. Meyer, J. Polanska, C. Henkel, P. Widlak, Molecular profiles of thyroid cancer subtypes: classification based on features of tissue revealed by mass spectrometry imaging, Biochim. Biophys. Acta Proteins Proteom. 1865 (2017) 837–845, https://doi.org/10.1016/j.bbapap.2016.10.006.
[39] P.S. Gromski, H. Muhamadali, D.I. Ellis, Y. Xu, E. Correa, M.L. Turner, R. Goodacre, A tutorial review: metabolomics and partial least squares-discriminant analysis - a marriage of convenience or a shotgun wedding, Anal. Chim. Acta 879 (2015) 10–23, https://doi.org/10.1016/j.aca.2015.02.012.
[40] M. Hanselmann, U. Köthe, M. Kirchner, B.Y. Renard, E.R. Amstalden, K. Glunde, R.M.A. Heeren, F.A. Hamprecht, Toward digital staining using imaging mass spectrometry and random forests, J. Proteome Res. 8 (2009) 3558–3567, https://doi.org/10.1021/pr900253y.
[41] D.S. Dalisay, K.W. Kim, C. Lee, H. Yang, O. Rübel, B.P. Bowen, L.B. Davin, N.G. Lewis, Dirigent protein-mediated lignan and cyanogenic glucoside formation in flax seed: integrated omics and MALDI mass spectrometry imaging, J. Nat. Prod. 78 (2015) 1231–1242, https://doi.org/10.1021/acs.jnatprod.5b00023.
[42] E.A. Jones, A. Van Remoortere, R.J.M. Van Zeijl, P.C.W. Hogendoorn, J.V.M.G. Bovee, A.M. Deelder, L.A. McDonnell, Multiple statistical analysis techniques corroborate intratumor heterogeneity in imaging mass spectrometry datasets of myxofibrosarcoma, PLoS One 6 (2011), https://doi.org/10.1371/journal.pone.0024913.
[43] P.W. Siy, R.A. Moffitt, R.M. Parry, Y. Chen, Y. Liu, M.C. Sullards, A.H. Merrill, M.D. Wang, Matrix factorization techniques for analysis of imaging mass spectrometry data, in: 2008 8th IEEE International Conference on BioInformatics and BioEngineering, IEEE, 2008, pp. 1–6, https://doi.org/10.1109/BIBE.2008.4696797.
[44] M. Arts, Z. Soons, S.R. Ellis, K.A. Pierzchalski, B. Balluff, G.B. Eijkel, L.J. Dubois, N.G. Lieuwes, S.M. Agten, T.M. Hackeng, L.J.C. van Loon, R.M.A. Heeren, S.W.M. Olde Damink, Detection of localized hepatocellular amino acid kinetics by using mass spectrometry imaging of stable isotopes, Angew. Chemie Int. Ed. Engl. 56 (2017) 7146–7150, https://doi.org/10.1002/anie.201702669.
[45] K.B. Louie, B.P. Bowen, S. McAlhany, Y. Huang, J.C. Price, J. Mao, M. Hellerstein, T.R. Northen, Mass spectrometry imaging for in situ kinetic histochemistry, Sci. Rep. 3 (2013) 1656, https://doi.org/10.1038/srep01656.
[46] K. Louie, B. Bowen, R. Lau, T. Northen, Localizing metabolic synthesis in microbial cultures with kinetic mass spectrometry imaging (kMSI), bioRxiv (2016) 50658, https://doi.org/10.1101/050658.
[47] W. Reindl, T.R. Northen, Rapid screening of fatty acids using nanostructure-initiator mass spectrometry, Anal. Chem. 82 (2010) 3751–3755, https://doi.org/10.1021/ac100159y.
[48] R.A. Heins, X. Cheng, S. Nath, K. Deng, B.P. Bowen, D.C. Chivian, S. Datta, G.D. Friedland, P. D'Haeseleer, D. Wu, M. Tran-Gyamfi, C.S. Scullin, S. Singh, W. Shi, M.G. Hamilton, M.L. Bendall, A. Sczyrba, J. Thompson, T. Feldman, J.M. Guenther, J.M. Gladden, J.-F. Cheng, P.D. Adams, E.M. Rubin, B.A. Simmons, K.L. Sale, T.R. Northen, S. Deutsch, Phylogenomically guided identification of industrially relevant GH1 β-glucosidases through DNA synthesis and nanostructure-initiator mass spectrometry, ACS Chem. Biol. 9 (2014) 2082–2091, https://doi.org/10.1021/cb500244v.
[49] T. de Rond, M. Danielewicz, T. Northen, High throughput screening of enzyme activity with mass spectrometry imaging, Curr. Opin. Biotechnol. 31C (2014) 1–9, https://doi.org/10.1016/j.copbio.2014.07.008.
[50] J. Ghyselinck, K. Van Hoorde, B. Hoste, K. Heylen, P. De Vos, Evaluation of MALDI-TOF MS as a tool for high-throughput dereplication, J. Microbiol. Methods 86 (2011) 327–336, https://doi.org/10.1016/j.mimet.2011.06.004.
[51] S.K. Küster, S.R. Fagerer, P.E. Verboket, K. Eyer, K. Jefimovs, R. Zenobi, P.S. Dittrich, Interfacing droplet microfluidics with matrix-assisted laser desorption/ionization mass spectrometry: label-free content analysis of single droplets, Anal. Chem. 85 (2013) 1285–1289, https://doi.org/10.1021/ac3033189.
[52] M. de Raad, T. de Rond, O. Rübel, J.D. Keasling, T.R. Northen, B.P. Bowen, OpenMSI arrayed analysis toolkit: analyzing spatially defined samples using mass spectrometry imaging, Anal. Chem. 89 (2017) 5818–5823, https://doi.org/10.1021/acs.analchem.6b05004.