Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
Contents lists available at ScienceDirect
Chemometrics and Intelligent Laboratory Systems journal homepage: www.elsevier.com/locate/chemolab
Quality assessment of boar semen by multivariate analysis of flow cytometric data Hamid Babamoradia,⁎, José Manuel Amigo a, Frans van den Berg a, Morten Rønn Petersenb, Nana Satakec, Gry Boe-Hansenc a b c
Spectroscopy & Chemometrics, Dept. of Food Science, University of Copenhagen, DK-1958 Frederiksberg, Denmark The Fertility Clinic, University Hospital of Copenhagen, Copenhagen, Denmark School of Veterinary Science, University of Queensland, Gatton, Queensland 4343, Australia
a r t i c l e
i n f o
Article history: Received 6 October 2014 Received in revised form 26 January 2015 Accepted 8 February 2015 Available online 14 February 2015 Keywords: Flow cytometry k-Means OPTICS ASCA Normalization Boar semen
a b s t r a c t Flow cytometry (FCM) has become very powerful over the last decades, enabling multi-parametric measurements of up to thousands of cells per second. This generates massive amounts of data on individual cell characteristics that need to be analyzed in an efficient manner from both physiological and chemical points of view. In this study, a methodology of analysis for FCM data was comprehensively studied to assess quality changes in semen extracted from boars. The proposed methodology combines new automated multi-dimensional data normalization, a density-based clustering method for identification of cell populations, and multivariate methods for post-analysis of the identified populations, enabling the exploratory evaluation and prediction/classification of subpopulations within the experimental data set. The performance of the suggested methodology was compared with the performance of an existing automated clustering method. © 2015 Elsevier B.V. All rights reserved.
1. Introduction Analytical flow cytometry (FCM) has been widely used in a wide range of applications such as cell function [1], cell death [2], DNA and RNA analysis [3], and immunophenotyping [4]. FCM is a fluorescencebased technique that analyzes multiple parameters on a single cell level or within a population of cells, enabling physical and chemical characterization of up to thousands of cells per second. The technique is used extensively, both in research and routine settings, generating a vast amount of data for a single experiment that need to be analyzed efficiently and effectively in order to utilize the systems to their full potential. FCM data analysis is usually performed in two steps: first, identification of cell populations; and second, post analysis of data extracted from the identified cell populations. The first step is related to the identification of cell populations (i.e., gating). Automated gating methods have been extensively studied and many have been proposed [5]. Nevertheless, every automated method must be adapted to the different kind of FCM data and, therefore, manual gating is extensively used. Manual gating requires operator decision making to determine cell populations, having disadvantages with nongeneralizable solutions (i.e., time-consuming and subjective, often ⁎ Corresponding author. Tel.: +45 35333500; fax: +45 35333245. E-mail address:
[email protected] (H. Babamoradi).
http://dx.doi.org/10.1016/j.chemolab.2015.02.008 0169-7439/© 2015 Elsevier B.V. All rights reserved.
providing erroneous and irreproducible results) [6,7]. On the other hand, semi-automated methods that are based on mathematical and statistical approaches are preferred since manual gating is minimized without losing accuracy. The second step is directly linked to the analysis of the detected cell populations. For this analysis, one- and two-way analysis of variance (ANOVA) is used [8–11]. ANOVA assumes that the data extracted from the populations (represented by variables) are independent. Nevertheless, in most cases there are only a few sources of variation among populations and each variable or attribute extracted from the populations is simply a different reflection of these few sources of variation [6,12–16]. Therefore, this perspective makes the univariate analysis of the extracted data and thus the interpretation extremely cumbersome. The multivariate philosophy though aims at extracting information on the principal direction of the variations, improving the signal to noise ratio and highlighting the main sources of variance in the collected data [17]. Interpretation of such few main sources of variation is much easier and enables a better understanding of the underlying biology. This is only possible by employing multivariate analysis. From a biological perspective, semen in its natural state is a heterogeneous cell suspension consisting of different sub-populations of spermatozoa with variable attributes [18]. Hence, sperm function analysis by FCM methodology with commercially available fluorescent probes has been extensively used [19]. Data analysis by objective clustering methods has been previously used to examine other sperm attributes
220
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
that does not permit FCM assessment. Data obtained through computer assisted semen analysis (CASA) technique assess multiple kinematic measures of sperm motion [20,21], traditionally tested on averages of the whole population. However, multi-parameter CASA data permit cluster analysis to be used to enhance statistical analysis due to the heterogeneity of the sperm motility within an ejaculate [22–24]. Therefore, an objective approach to analyze spermatozoa derived FCM data would help to understand population dynamics within an ejaculate and describe differences in subpopulations within and between individuals. These differences are important in determination of the semen quality among ejaculates. Semen quality is a contributing factor to fertility that can be used in evaluation of boar fertility. This study examines clustering methods for automated data preprocessing and semi-automated identification of cell populations for automated gating, and multivariate analytical methods for the study of the detected cell populations. For this purpose, an experimentally controlled FCM dataset from boar semen was used in order to assess the seasonal and within-boar variation of the quality of the semen. The aim of the analysis is to study the effect of breed type and seasonal change on the quality of boar semen. The dataset consisted of flow cytometric measurements of semen ejaculates collected from five boars over five periods of time (25 ejaculates), where four attributes were measured for each ejaculate. This dataset was chosen for this study to introduce a comprehensive workflow for processing large and complex data obtained from flow cytometric systems, where different challenges are addressed. The results of clustering were compared between the proposed clustering method and a completely automated method. 2. Semen sample collection and data acquisition Ejaculates of mature Large White boars (n = 5; Premier Pig Genetics, Wacol Pig AB Centre, Australia) were collected using the gloved-hand technique (Hancock 1967), extended in a commercial semen extender (1:1 semen to extender), Androstar (Minitube), and transported to the laboratory and stored at ambient temperature. Ejaculates were collected from the same boars over five dates over a 4 month period (06/12, 27/12, 24/01, 21/02, 14/03). Flow cytometric analysis was carried out using a Beckman Coulter Gallios™ flow cytometer (Beckman Coulter PTYLTD, Gladesville, Australia) on the extended semen samples on arrival to the laboratory for the following fluorescent attributes: Viability (PI/SYBR14), acrosome integrity (PI/FITC-PNA), membrane fluidity (YoPro-1/M540), and reactive oxygen species (ROS) production (PI/H2DCFDA). A prototypical mature ejaculated spermatozoa population would for these attributes have the following functional attributes; viable (SYBR14 positive), intact acrosome (FITC-PNA negative), impervious membrane fluidity and non-apoptotic; (YoPro-1 and M540 negative) and low in ROS production (H2DCFDA negative). Sperm suspensions were loaded with fluorescent probes for flow cytometric analysis, performed on a Gallios™ flow cytometer (Beckman Coulter Pty Ltd, Gladesville, Australia). 10,000 events were assessed per semen sample, at the rate of 50–200 events per second. Forward (FS) and side (SS) light scatter in linear mode was used. All fluorescent probes utilized within this study were excited using a 15 mW argon ion solid state laser at 488 nm and the fluorescence emissions of stained spermatozoa were analyzed in logarithmic mode using appropriate detectors for the emission spectra of the fluorescent probes. Dual staining for 10 min at 38 °C, were carried out in order to assess the following sperm functions: Viability using 100 nM of SYBR14 (emmax 516 nm) and 12 μM of propidium iodide (PI; emmax 625 nm); acrosome integrity using 5 μg/ml of FITC-PNA (emmax 530 nm) and 12 μM of PI; membrane fluidity using 25 nM of merocyanine 540 (M540, emmax 590 nm) and 25 nM of YoPro-1 (emmax 509 nm); and ROS production using 1 μM of carboxyfluoresceindiacetate (H2DCFDA, emmax 525 nm) and 12 μM of PI. Fluorescence emission of FITC-PNA, Yo Pro-1, and H2DCFDA were detected using bandpass filter 525 nm, FL1; emission of PI was detected on bandpass filter 620/30 nm, FL3; and emission of M540 was detected
on bandpass filter 575/30 nm and 620/30 nm, FL2 and FL3 respectively. The results were transformed into listmode files (.lmd) using Kaluza Analysis Software (v1.2 Beckman Coulter Australia Pty Ltd, Gladesville, Australia). The MATLAB code used for OPTICS algorithm is available at http://chemometria.us.edu.pl/ while ASCA was applied using Eigenvector's PLS-toolbox for MATLAB. The source code for SWIFT can be downloaded from: http://www.ece.rochester.edu/projects/siplab/ Software/SWIFT.html. The semen samples were collected between December 2010 and March 2011, and approved by the University of Queensland Animal Ethics Committee; SVS/555/09/PORK CRC. The data contains 100 measurements, 25 measurements per attribute. A measurement is recordings of scatter and fluorescence for 10,000 cells per ejaculate. 3. Identification of cell population 3.1. Automated multi-channel data normalization FCM datasets may typically be affected from small changes in the fluorescence signal across different measurements, i.e., the positions of the same cell population may shift between measurements [25]. This issue hinders an accurate data analysis since measurements are analyzed simultaneously and the same population in two measurements might be incorrectly assigned to two different cell categories due to instrumental differences. Therefore, the data need to be preprocessed in order to minimize the shifts (technical variations). This is even more crucial when automated methods are utilized to analyze FCM data since they cannot distinguish between biological and technical variations. Normalization [25], Standardization [26], quality assessment and control [27,28] are used for referring the procedures for minimizing technical variations in FCM data. Normalization will be used here since it is the specific term for minimizing technical variations across measurements. The most straightforward way of normalization is to adjust the data for each channel (fluorochrome) independently [25]. This strategy does not take into consideration the correlation between channels in minimizing the shifts, which might be acceptable when data of each channel is analyzed also independently. However, most FCM analysis procedures use several channels simultaneously. It is thus reasonable to take this correlation also into account during data normalization, assuring that no artifacts are introduced. Typically, two channels are analyzed simultaneously [29] where the data is visualized through scatter plots to identify cell populations and perform a two-dimensional normalization to minimize the shifts. In this work multi-dimensional normalization is proposed using a combination of k-means clustering [30] and Procrustes analysis [31]. The algorithm estimates the cluster centers once for the entire dataset to define target centers. Then, it estimates the cluster centers for each measurement using target centers as an initial estimate for the k-means algorithm. It calculates Procrustes analysis components – including a translation, orthogonal rotation, and scaling operation (no reflection) – to match the centers of each measurement to the target centers. For each measurement, the components are applied to the data of all cells to normalize the data towards the target centers. The inputs for the algorithm are the entire dataset and number of clusters (k). The algorithm works as follows: 1. Compute centers of k clusters as target centers using entire measurements (input is measurements concatenated across the row direction — a matrix of size 25 × 10,000 by number of channels) 2. Use the target centers as the initial estimate for k-means clustering to compute centers of k clusters for a measurement (input is a matrix of size 10,000 by number of channels) 3. Translate the centers towards the target centers 4. Orthogonally rotate the centers to match the target centers 5. Scale the centers so that their Frobenius norm matches the norm of the target centers
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
6. Store the translation, rotation, and scaling components 7. Apply Procrustes analysis to data of all cells for the measurement using the stored components 8. Apply step 2 to 7 for all remaining measurements individually Cluster centers can be determined by calculating the mean, median, or mode of the clusters. The arithmetic mean is too sensitive to the tail of the distribution. Thus, it can easily be pulled away from the middle of the peak by a few extreme values (outliers/noise). Median and mode are less sensitive too extreme values, but the mode can be away from the peak entirely when there is a pileup of off-scale cells at either end of the histogram. Therefore, the median is preferred as the most robust estimate of intensity characterizing a population of cells [32]. The suggested normalization algorithm is an automated approach that only needs the correct number of clusters in the data as input, which is defined by an expert. The approach can perform multi-dimensional data normalization since both k-means clustering and Procrustes analysis can handle data of high dimensionality. Note that in case of onedimensional data normalization the orthogonal rotation component is equal to unity since no rotation is possible for this case. In order to minimize the risk of sub-optimal data partitioning, the correct number of clusters should be selected for the normalization algorithm. 3.2. Manual gating Once the FCM data is normalized, they can be analyzed to identify different cell populations based on their fluorescence responses. FCM manual gating is a common method to identify cell subsets [6], where data are visualized graphically using histograms or scatter/density plots, determining manually the cell subsets one at a time. The gate boundaries are set manually based on the experience of the users and knowledge of the data. The main disadvantages are that the method needs to be performed by an expert and it is limited to threedimensional data because visualization in higher dimensions becomes cumbersome. One way to overcome the latter is to reduce the dimensionality by principal component analysis (PCA) with gating using the scatter plots of the score-values [13,33]. However, this solution will not help if more than three principal components (PCs) are needed to model the data. In the case of compact data with overlapping subpopulations, visualization of data and thus selection of subpopulations become very difficult. This leads to poor results when manual gating is used [6]. Manual gating is used in this work as the reference method to evaluate the performance of the combination of the normalization and OPTICS. 3.3. Optics The so-called Ordering Points To Identify the Clustering Structure (OPTICS) is a hierarchical clustering method that generalizes densitybased clustering [34,35]. OPTICS shows the data structure by sorting data objects according to the reachability distance (RD), a measure based on the distance of each point to the other points. This provides a plot called reachability plot that is similar to a dendrogram in hierarchical clustering which can be used both manually or in an automated way to cluster the data [34,36]. Typical hierarchical clustering algorithms have two disadvantages compared to OPTICS: first, in general it suffers considerably from the single-link effect, i.e., from the fact that clusters that are connected by a line of few points having a small inter-object distance are not separated. Second, the results produced by hierarchical algorithms, i.e., the dendrograms, are hard to understand or analyze for more than a few hundred objects. OPTICS can handle high-dimensional data directly, unlike manual gating. OPTICS can also handle nested clusters, clusters with different densities of any shape, and inliers plus outliers/noise. For very large datasets with millions of cells the OPTICS might be slow but a faster
221
alternative has been suggested [37]. A reachability plot is convertible to a dendrogram if this is the preferred representation [36]. The algorithm needs two input parameters: the scanning radius (ε) (the radius of the neighborhood of each object in a data set) and the minimum number of objects considered as a cluster in the data (MinPts). The parameter ε is normally set to the maximal distance between two objects in the data [35], or infinity [36]. For MinPts, a very large value (10% of the number of objects in the data) results in a smooth reachability plot that captures large clusters, while a small value (1% of the number of objects in the data) results in a less smooth reachability plot that enables detecting very small clusters. OPTICS is not sensitive to MinPts and as a rule of thumb a value between 3 and 5% of the number of objects in the data is suggested for a meaningful representation of the data structure [38]. To understand the OPTICS algorithm two terms need to be described: the core distance (CD) and the reachability distance (RD). CD for object i is the distance between object i to its MinPtsth nearest neighbor, and RD for object i is the maximum of the distance of object i to its nearest neighbor (jth) and core distance of that object (CDj). RD for ith object (with jth object being its nearest neighbor) is defined as: RDi ¼ maxðdij ; CDj Þ
ð1Þ
The algorithm of OPTICS can be described as follow [34,35]: 1. Define the minimal number of objects considered as a cluster MinPts. 2. Calculate CD for all objects. 3. Give all objects an arbitrary large RD value. 4. Randomly select an object or select the first object in the data. 5. Place the selected object as the first in the order list. 6. Mark the selected object as processed and consider it as current object. 7. Update RDs for unprocessed objects with respect to the current object (using CD for current object). 8. If the RD for any unprocessed object is smaller than the RD for the current object with respect to the previous current object, replace the RD for the current object with the current RD. 9. Select the next object with the smallest RD. 10. Mark it as processed and consider it as the current object. 11. Go back to step 5 until all objects are processed. 12. Generate the reachability plot by drawing ordered RDs in a bar plot. Euclidean or squared Euclidean are the most common distances used for OPTICS algorithm [34,35]. Fig. 1 shows a simulated dataset and its corresponding reachability plot resulted from OPTICS. A flowchart illustrating the steps to extract data of cell populations from FCM data is presented in Fig. 2 and includes data transformation, normalization, and clustering. The data of one attribute are loaded and transformed into a logarithmic or another preferred scale [39], and then normalized to minimize the shifts across measurements (see Section 3.1 for details). Once the data are normalized, OPTICS is applied to obtain reachability plot showing the structure of the data. Relevant clusters are determined using the reachability plot and then the number of cells in each cluster for each measurement is counted to construct a data matrix Xcount of size N by K, where N and K are the number of measurements and clusters, respectively. The constructed matrix is then analyzed using the relevant multivariate method depending on the purpose of the analysis: exploratory analysis, classification, or prediction (see Section 4 for details). 3.4. SWIFT clustering Scalable weighted iterative sampling for flow cytometry clustering (SWIFT) is an automated clustering method based on weighted iterative sampling [40] that scales existing statistical model-based clustering
222
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
Fig. 1. Reachability plot for a simulated dataset resulted from OPTICS (MinPts = 40).
methods to large FC datasets. The SWIFT algorithm has three stages. In the first stage, SWIFT introduces a weighted iterative sampling framework for Gaussian mixture modeling by the ExpectationMaximization (EM) algorithm. In the second stage (after mixture model fitting), SWIFT examines each of the Gaussian clusters and splits any clusters that are bimodal along any dimensions or principal components. This stage is often crucial for identifying rare populations in highdimensional flow datasets. Finally, in the third stage, a graph-based merging is applied to fuse strongly overlapping Gaussians based on a normalized overlap measure and an entropy-based stopping criterion. This stage allows the method to represent skewed clusters frequently seen in FC data. The SWIFT algorithm needs an input cluster number that is the number of Gaussian distributions to be used at the first step. This number is not critical since it is substantially modified by splitting and merging steps, but it might improve the clustering results if it is a reasonable approximation of the final number. 4SWIFT was used in this study to compare its performance with OPTICS and it was chosen for this purpose because it can handle large datasets and it is completely automated. 3.5. Evaluation of clustering performance To evaluate the performance of a clustering algorithm, the results are compared with predefined classes in the data. These predefined classes
Fig. 2. A flowchart to extract data of cell populations from FCM data.
are often created by experts thorough manual gating. Statistical measures are then used to measure how close the clustering is to the predetermined classes. The F-measure statistic is often used for the purpose that is the harmonic mean of the precision and recall according to the equation F = (2 × precision × recall) / (precision + recall) [5]. Precision corresponds
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
to the number of cells correctly assigned to a cluster divided by the total cells assigned to that cluster, which is a measure of false positive rate. Recall corresponds to the number of cells correctly assigned to a cluster divided by all the cells that should have been assigned to that cluster, which is a measure of false negative rate. The mentioned formula gives a value of F-measure for each cluster. In case a single value for the whole clustering result is of interest, which is case this study, all clusters is considered simultaneously. The precision thus corresponds to the total number of cells in the data assigned correctly divided by the total number of cells assigned to all the clusters. The recall then corresponds to the total number of cells in the data assigned correctly divided by the total number of cells that should have been assigned to all the clusters. An F-measure of 1.0 indicates perfect reproduction of the manual gating result with no false positive or false negative events. 4. Multivariate data analysis of cell populations Once the cell populations are identified, the variation among them is examined for exploratory analysis or prediction. In exploratory analysis, principal component analysis (PCA) can be used [13,14,41]. In the event that the effects are a result of several sources of variation (e.g., animal and collection time) PCA might be unable to give a clear interpretation due to the interaction among the effects. Therefore, methods like ANOVA–simultaneous component analysis (ASCA) [42] or analysis of variance–principal component analysis (ANOVA–PCA) [43] can be highly effective. For classification purposes partial least squaresdiscriminant analysis (PLS-DA) [44] or support vectors machine (SVM) [5,45] are often used; whereas for prediction regression methods such as partial least squares (PLS) are appropriate choices [15]. ASCA and ANOVA–PCA are similar methods based on PCA that are the most relevant multivariate methods in this study since they enable us to examine the effect of each factor on the semen quality. On the other hand, standard PCA does not have this capability, hence it is not considered in this study. It has been shown that ASCA discriminates between levels of factors better compared to ANOVA–PCA, but it is prone to overfit the level differences [46]. Since ASCA is a betterknown method in the chemometric community, it is used in this study. 4.1. ANOVA-simultaneous component analysis (ASCA) ASCA [42,46] combines the statistical advantages of ANOVA and the multivariate perspective of PCA for studying co-variation among variables. It decomposes a data matrix into additive matrices of the same size, characterizing single factors of the experimental design and the residual error. ASCA allows the comparison of the means for any factor against the residual experimental error. It shows whether the means (i.e., levels for a factor) differ significantly with respect to the reproducibility of the measurement. To describe the theory of ASCA an experimental design is considered with m dependent variables measured for n observations, and two factors (independent variables), α and β, having an interaction (αβ). The design results in a data matrix X of dimension N × M. In the ASCA model, X is split into the matrix of overall/grand mean X, effect matrices Xα and Xβ, containing the level averages for each factor, and a matrix X(αβ) that describes the interaction effect between the two factors. The variation that cannot be explained by the model is represented in the residual matrix E. The data matrix is thus decomposed as follows: X ¼ X þ Xα þ Xβ þ XðαβÞ þ E
ð2Þ
The X represents the overall/grand mean of X, which is constructed from averages of the data for each variable. Each row of Xα comprises the average for the corresponding level of factor α. E.g. for a two-level design in which the odd and even rows corresponded to the different levels, the odd rows of Xα comprise the average over measurements
223
for odd rows of X and, likewise, the even rows contain the average over measurements for even rows of X. A similar computation is performed for factor β to obtain Xβ. Xαβ comprises the average for each sample (a row in the matrix) with respect to the two factors. For a two-level design there are 4 averages forming n rows of Xαβ . E is computed by subtracting X, X α, Xβ, and Xαβ from X. ASCA applies PCA to each of the effect matrices, where for each effect matrix ASCA decomposition can be written as T
Xk ¼ Tk Pk k∈fα; β; αβg
ð3Þ
where Tk and Pk are the scores and loadings for the effect matrix k. Note that the loading are orthonormal. The goal in ASCA is to estimate a Pk in a manner that the variation between level averages for each factor is maximized. The PCA models only explain the variation between level averages, where the variation within levels is not explained. To estimate the variation within levels in the loading space, the effect matrix plus the residual matrix (Xk + E) is projected onto the loadings Pk of the PCA model for Xk. The resulting projections Yk contain the variation within levels in the principal component subspace of the factor k, Yk ¼ ðXk þ EÞPk k∈fα; β; αβg:
ð4Þ
Data matrices should be mean-centered (to remove offsets) or autoscaled (to remove offset and have unit variance for each variable) before applying PCA. 5. Results and discussion 5.1. Forward and side scatter The size of the cells is represented by FS, where larger cells show higher values compared to small cells; whereas the internal cell complexity/granulation is represented by SS, where cells with high complexity show higher values of SS compared to cells with low complexity. These two types of scattering indicate physical characteristics of the cells; cells with the similar characteristics form groups in a scatter plot. Fig. 3 shows the results of OPTICS on the scatter data of all 25 measurements used in selecting the spermatozoa. Here we set MinPts to 2500 (1% of the number of cells in the dataset) to assure no detail was neglected. No data normalization was applied since the scatter detection across measurements for FS and SS were negligible. Fig. 3a shows the scatter plot of the original data, while Fig. 3b shows the reachability plot obtained from OPTICS. In the reachability plot the areas with low values of RD show potential clusters, whereas areas with the high values indicate noise. There are different strategies of extracting the proper clusters from the reachability plots [34,36]; here we applied a single-value threshold (Fig. 3c–d) since all the clusters have similar densities. The cells with a RD lower than the threshold were considered as members of the corresponding cluster, while the cells with a RD higher than the threshold were considered noise. The blue cluster (Fig. 3) identifies spermatozoa while the other two clusters show non-sperm cells where the red cluster contains smaller particles with low internal complexity and the green cluster contains larger cells with very high internal complexity. After the initial identification of the sperm population, this index was applied to select only spermatozoa for analysis of fluorescence. However, the identified clusters might be useful to compare the composition of cell subpopulation among the individual boars (this information is not often used in the data analysis). To illustrate this point, the number of cells in each cluster for each measurement was counted and plotted in a three-dimensional plot (Fig. 4a). Number of spermatozoa for each boar type is shown in Fig. 4b.
224
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
Fig. 3. Results of OPTICS on the scatter data, (a) scatter data, (b) reachability plot resulted from OPTICS (MinPts = 2500), (c) clustered scatter data resulted from (d) reachability plot with threshold applied. Blue cluster contains the spermatozoa while the other clusters contain non-sperm events.
Fig. 4. Three-dimensional plot of cell count in each cluster for each boar (a); box-plot of spermatozoa count for each boar type (b).
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
Fig. 4a shows a separation between individual boars. Boar 2 had the highest number of spermatozoa and the lowest level of debris in the ejaculates in comparison to other boars (see also Fig. 4b). Boar 4 had the highest number of cells in Cluster 2 (a non-sperm cluster). It is unknown exactly what kind of semen cells this cluster represents. It seems however that this is a specific characteristic of this boar. Boar 5 had the second highest number of spermatozoa. Boar 3 had the highest number of cells in Cluster 3 (a non-sperm cluster), and a low number of cells in the other two clusters. Measurements from boar 1 showed an unclear pattern because of high variability in spermatozoa counts. Two-way ANOVA was applied to the data of spermatozoa counts to see the effect of boar type and seasonal change on the number of spermatozoa. Boar type was found significant (p-value of 0.002) while seasonal change was found statistically insignificant (p-value of 0.198). 5.2. Fluorescence data 5.2.1. Cell identification Unlike cell identification by FS and SS, fluorescence emission shows considerable shift in detection across measurements and therefore
225
populations require normalization. Fig. 5 shows the difference between the original raw data and the normalized semen measurements for the viability attribute using the automated method described in the data normalization Section (3.1) of this paper. It can be observed in Fig. 5 that the normalization gave denser clusters and a sharper separation among the clusters. The number of clusters (the k-means input) was set to 3 instead of 4 which seems a reasonable choice (the fourth potential cluster is located around PI value of approximately 4.5 and SYBR14 value of 4.5). One disadvantage of k-means is that it might lead to erroneous results when sizes of clusters are very different, as is the case here. This means that the size of the fourth cluster is much smaller than the other three clusters. k-Means tends to split one of the large clusters into two clusters to form a fourth cluster, instead of finding the fourth natural cluster. It can be seen that normalization has improved the quality of the data. It is however interesting to see how the improvement affects the results of clustering. For this purpose, both raw and normalized data for the viability attribute were clustered using OPTICS. The thresholds of the reachability plots were set manually by an expert.
Fig. 5. Raw and normalized fluorescence data for viability attribute: (a) scatter plot of raw data, (b) bivariate histogram of raw data, (c) scatter plot of normalized data, (d) bivariate histogram of normalized data.
226
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
Fig. 6. OPTICS clustering of raw and normalized fluorescence data for the viability attribute: (a) clustered raw data, (b) reachability plot for raw data, (c) clustered normalized data, (d) reachability plot for normalized data.
The results in Fig. 6 show that clustering of the normalized data is easier than the raw data. The reachability plot shows three clear clusters with similar densities plus a small low-density cluster (shown in red; note that the black dots in the clustered data are considered noise). The observed clusters for the normalized data match the biological interpretations of the cell responses. On the other hand, clustering of the raw data is more problematic because in the reachability plot at least one non-natural cluster is detected (Fig. 6b). This was found not because of a distinct biological response but because of a shift across measurements. Comparison of corresponding clusters in the two reachability plots (Fig. 6b and d) shows that for the normalized data, the clusters are denser and the separation among clusters is more defined. The normalization was also applied to the attribute data of acrosome integrity, membrane fluidity, and ROS production (Fig. 7) individually (meaning independent normalization was applied for each attribute) that resulted in improved data quality to enable a more robust clustering of sperm function for further analysis. OPTICS was also applied to normalized data of acrosome integrity, membrane fluidity, and ROS production and they were clustered individually. The number of cluster was four for viability (live, dead, moribund, and debris), three for acrosome integrity
(live, damaged, and dead), two for membrane fluidity (fluid and non-fluid), and two for ROS production (live and dead). 5.2.2. Comparison of clustering performance SWIFT was also applied to the concatenated data of all the measurements for each attribute. The input cluster numbers were set the same as numbers detected in OPTICS (four for viability, three for acrosome integrity, two for membrane fluidity, and two for ROS production) to have a fair comparison between SWIFT and OPTICS (note that SWIFT was applied to the non-normalized data). The clustering results of our suggested methodology (automatic normalization and OPTICS) were then compared to the results of SWIFT using the F-measure (Table 1). Manual gating was done on each measurement individually to predefine the classes to enable calculation of F-measure (F-measure was computed based on the formula that was introduced in Section 3.5.). Overall, both methods have performed well since they have resulted in high values of F-measure. The average of F-measure was also calculated for each attribute (last row), showing a slightly better performance for the combination of the normalization and OPTICS, especially for the Viability attribute. It is important to note that SWIFT has performed poorly in clustering three
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
227
Fig. 7. Raw and normalized fluorescence data of semen for acrosome, membrane fluidity, and ROS attributes illustrated through bivariate histograms: (a) raw and (b) normalized data for acrosome integrity, (c) raw and (d) normalized data for membrane fluidity, (e) raw and (f) normalized data for ROS production.
measurements (nos. 3, 8, and 18) for the viability attribute. All three measurements suffered from severe shifts compared to other measurements. It is interesting to know that the nonnatural cluster shown in Fig. 6a and b in fact represents measurement no. 3. On the other hand, combination of the normalization and OPTICS not only improved the clustering performance for most of the measurement, it was also capable of handling shifts in the data.
5.2.3. Analysis of variance by ASCA As it was mentioned in Section 5.2.1, OPTICS resulted in the identification of 11 clusters in all four attributes. Cells in each cluster for each measurement was counted that gave a data matrix Xcount of size 25 (samples) by 11 (variables), representing 25 measurements and 11 sperm cell responses. ASCA was applied to the data matrix to evaluate the effect of boar type and seasonal change on the quality of boar semen. Boar type effect explained about 41% of the variation in the data, while only 18% of the variation was explained by the effect seasonal change. The rest of the variation was associated with the residuals. Fig. 8a shows the PC1 vs. PC2 score plot for the ASCA models fitted to the effect matrix of boar type factor (the matrix was mean centered). Fig. 8b shows the corresponding PC1 vs. PC2 loading plot for the model.
It is observed in Fig. 8a that there is a clear grouping of samples belonging to each boar. According to Fig. 8b, large positive score values for PC1 in the boar effect model (explaining about 90% of variation) showed a high number of viable spermatozoa, while large negative values showed a high number of dead spermatozoa. Therefore, the PC1 in the boar effect model turns to be an appropriate measure to evaluate the quality of spermatozoa (PC2 mainly explains the variations between moribund and debris cells). Based on PC1, boars 2 and 5 had the highest sperm quality among all five boars, whereas boars 3 and 4 had the least favorable quality. Samples from boar 1 showed an unclear pattern because of high variability in the sperm quality, which is the same boar that showed high variability in spermatozoa counts (see Fig. 4). The results shown in Fig. 8 are in accordance with the results shown in Fig. 4, meaning that boar breeds that had higher counts of spermatozoa in the semen and ejaculate had also higher quality of spermatozoa. Fig. 9a illustrates the PC1 vs. PC2 score plot for the ASCA models fitted to the effect matrix of date factor (the matrix was mean centered). Fig. 9b shows the corresponding PC1 vs. PC2 loading plot for the model. Fig. 9a illustrates that there is no clear grouping (except for Date 3) among the samples based on date effect. It is clear that the samples for Date 3 have different characteristics compared to the samples for other dates. According to Fig. 9b, a combination of both
228
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
Table 1 F-measure computed for clustering results of SWIFT and the combination multi-channel normalization and OPTICS. F-measure was calculated based on predefined classes performed by manual gating of the data. F-measure Membrane fluidity
Acrosome
ROS
Viability
Ejaculate/measurement
OPTICS
SWIFT
OPTICS
SWIFT
OPTICS
SWIFT
OPTICS
SWIFT
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Mean
0.95 0.97 0.99 0.99 0.99 0.98 0.96 0.98 0.99 0.99 0.98 0.98 0.98 0.98 0.98 0.97 0.97 0.97 0.97 0.97 0.99 0.96 0.98 0.98 0.98 0.98
0.93 0.94 0.97 0.98 0.98 0.97 0.95 0.98 0.98 0.99 0.97 0.96 0.97 0.97 0.97 0.97 0.96 0.97 0.97 0.98 0.97 0.95 0.96 0.97 0.97 0.97
0.97 0.98 0.99 0.98 0.95 0.98 0.98 0.99 0.99 0.97 0.98 0.98 0.98 0.98 0.97 0.98 0.98 0.99 0.97 0.97 0.98 0.95 0.99 0.98 0.96 0.98
0.97 0.97 0.98 0.97 0.95 0.95 0.96 0.96 0.98 0.96 0.97 0.97 0.96 0.97 0.96 0.95 0.96 0.98 0.94 0.94 0.97 0.96 0.97 0.97 0.96 0.96
0.96 0.99 0.99 1.00 0.99 1.00 0.99 0.99 1.00 1.00 0.99 0.99 0.99 1.00 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99
0.94 0.95 0.97 0.98 0.97 0.98 0.98 0.98 0.98 0.98 0.96 0.96 0.96 0.96 0.97 0.97 0.97 0.97 0.96 0.96 0.98 0.98 0.98 0.98 0.98 0.97
0.96 0.98 0.99 0.98 0.97 0.99 0.98 0.98 0.98 0.96 0.97 0.96 0.99 0.97 0.97 0.94 0.98 1.00 0.98 0.96 0.99 0.98 0.99 0.98 0.98 0.98
0.92 0.92 0.77 0.96 0.93 0.94 0.95 0.86 0.96 0.96 0.94 0.91 0.91 0.90 0.95 0.93 0.95 0.75 0.94 0.96 0.98 0.97 0.95 0.96 0.95 0.93
F-measure is calculated based on predefined classes created by experts thorough manual gating. Bold values show F-measures that are unexpectedly too low. Mean shows the average for each attribute. OPTICS was applied to the normalized data, while SWIFT was applied to the data that was not normalized.
PCs should be used to evaluate the quality of spermatozoa. It seems that the top-right corner of the plot represents the high quality of spermatozoa, while the bottom-left corner of the plot represents the low quality. Based on this observation, it seems that Date 1 and Date 2 show slightly lower quality of spermatozoa compared to Dates 4 and 5, while Date 4 stands in the middle. Permutations were applied to obtain p-values for main effects [46]. The results show that the boar effect on the quality of spermatozoa is significant (p-value of 0.006), while the date effect is not significant (p-value of 0.087). This shows that the seasonal change does not affect
the quality of spermatozoa in boars. p-Value for the interaction effect could not be calculated since there was no repetition in the measurements. It is therefore assumed that the two factors do not have a significant interaction. 6. Conclusions This study presents a comprehensive method for analyzing FCM data to overcome some challenges that this type of data typically presents. The methodology includes an automated data normalization algorithm
Fig. 8. PC1 vs. PC2 score plot for the ASCA models fitted to the effect matrix (cell counts for all measurements for all four attributes) of boar type factor (mean centered) (a); Corresponding loading plot for the model.
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
229
Fig. 9. PC1 vs. PC2 score plot for the ASCA models fitted to the effect matrix (cell counts for all measurements for all four attributes) of date factor (mean centered) (a); Corresponding PC1 vs. PC2 loading plot for the model.
to minimize non-biological/technical variations, accompanied by a density-based clustering method for efficient identification of cell populations. The clustering result of the suggested methodology was compared with the results of an automated clustering method. The comparison showed that the suggested methodology in this case performed better. ASCA as a multivariate data analysis method was also applied for post-analysis of the identified populations to examine the effect of boar type and seasonal change on the quality of boar spermatozoa. The applicability of this methodology was demonstrated using FCM datasets of sperm cell function analyses, containing scatter and fluorescence data of four different attributes. The results show that the approach can effectively detect systematic variations based on individual and seasonal change effect, without subjective bias, thus this suggested methodology facilitates biological description of the measurements in a more objective and interpretable manner. There are several opportunities for future research. The first one could be the implementation of automated thresholding of the OPTICS reachability plot. The second one could be replacing k-means clustering in the data normalization algorithm by an automated density-based clustering technique in order to improve the performance of data normalization in FCM. Conflict of interest The authors declare that there are no conflicts of interest. Acknowledgement Hamid Babamoradi would like to acknowledge CHANCE (www. chance.life.ku.dk) for financial support. The data originate from Mr Malcolm Hill, Honours project funded by Pork CRC Ltd, Australia. Sample collection was done by Premier Pig Genetics, Wacol Pig AB Centre, Australia. References [1] A. Maftah, O. Huet, P. Gallet, M. Ratinaud, Flow cytometrys contribution to the measurement of cell functions, Biol. Cell. 78 (1993) 85–93. [2] I. Vermes, C. Haanen, C. Reutelingsperger, Flow cytometry of apoptotic cell death, J. Immunol. Methods 243 (2000) 167–190. [3] H.A. Crissman, Z. Darzynkiewicz, R.A. Tobey, J.A. Steinkamp, Correlated measurements of DNA, RNA, and protein in individual cells by flow-cytometry, Science 228 (1985) 1321–1324.
[4] M. Loken, J. Brosnan, B. Bach, K. Ault, Establishing optimal lymphocyte gates for immunophenotyping by flow-cytometry, Cytometry 11 (1990) 453–459. [5] N. Aghaeepour, G. Finak, H. Hoos, T.R. Mosmann, R. Brinkman, R. Gottardo, R.H. Scheuermann, FlowCAP Consortium, DREAM consortium, critical assessment of automated flow cytometry data analysis techniques, Nat. Methods 10 (2013) 445445. [6] E. Lugli, M. Roederer, A. Cossarizza, Data analysis in flow cytometry: the future just started, Cytometry A 77A (2010) 705–713. [7] A. Bashashati, R.R. Brinkman, A survey of flow cytometry data analysis methods, Adv. Bioinforma. 2009 (2009) 1–19. [8] K. Crotty, R. May, A. Kulvicki, D. Kumar, D. Neal, The effect of antimicrobial therapy on testicular aspirate flow-cytometry, J. Urol. 153 (1995) 835–838. [9] M. Joseph, D. Adams, J. Maragos, K. Saving, Flow cytometry of neonatal platelet RNA, J. Pediatr. Hematol. Oncol. 18 (1996) 277–281. [10] L. Tubman, Z. Brink, T. Suh, G. Seidel, Characteristics of calves produced with sperm sexed by flow cytometry/cell sorting, J. Anim. Sci. 82 (2004) 1029–1036. [11] E. Grimaldi, P. Carandente, F. Scopacasa, M. Romano, M. Pellegrino, R. Bisogni, M. De Caterina, Evaluation of the monocyte counting by two automated haematology analysers compared with flow cytometry, Clin. Lab. Haematol. 27 (2005) 91–97. [12] R. Mann, On multiparameter data-analysis in flow-cytometry, Cytometry 8 (1987) 184–189. [13] Y. Kosugy, R. Sato, S. Genka, N. Shitara, K. Takakura, An interactive multivariateanalysis of FCM data, Cytometry 9 (1988) 405–408. [14] H. Davey, A. Jones, A. Shaw, D. Kell, Variable selection and multivariate methods for the identification of microorganisms by flow cytometry, Cytometry 35 (1999) 162–168. [15] P. Wikstrom, T. Johansson, S. Lundstedt, L. Hagglund, M. Forsman, Phenotypic biomonitoring using multivariate flow cytometric analysis of multi-stained microorganisms, FEMS Microbiol. Ecol. 34 (2001) 187–196. [16] A.C. Eriksson, L. Jonasson, T.L. Lindahl, B. Hedback, P.A. Whiss, Static platelet adhesion, flow cytometry and serum TXB2 levels for monitoring platelet inhibiting treatment with ASA and clopidogrel in coronary artery disease: a randomised cross-over study, J. Transl. Med. 7 (2009) 42. [17] T.W. Anderson, An Introduction to Multivariate Statistical Analysis, Wiley, Hoboken, N.J., 2003 [18] M.S. Hossain, A. Johannisson, M. Wallgren, S. Nagy, A.P. Siqueira, H. RodriguezMartinez, Flow cytometry for the assessment of animal sperm integrity and functionality: state of the art, Asian J. Androl. 13 (2011) 406–419. [19] A.M. Petrunkina, R.A.P. Harrison, Cytometric solutions in veterinary andrology: developments, advantages, and limitations, Cytometry A 79A (2011) 338–348. [20] R. Davis, R. Siemers, Derivation and reliability of kinematic measures of sperm motion, Reprod. Fertil. Dev. 7 (1995) 857–868. [21] D. Katz, J. Overstreet, Sperm motility assessment by videomicrography, Fertil. Steril. 35 (1981) 188–193. [22] T. Abaigar, W.V. Holt, R.A.P. Harrison, G. del Barrio, Sperm subpopulations in boar (Sus scrofa) and gazelle (Gazella dama mhorr) semen as revealed by pattern analysis of computer-assisted motility assessments, Biol. Reprod. 60 (1999) 32–41. [23] T. Abaigar, M. Cano, A. Pickard, W. Holt, Use of computer-assisted sperm motility assessment and multivariate pattern analysis to characterize ejaculate quality in Mohor gazelles (Gazella dama mhorr): effects of body weight, electroejaculation technique and short-term semen storage, Reproduction 122 (2001) 265–273. [24] N. Satake, R. Elliott, P. Watson, W. Holt, Sperm selection and competition in pigs may be mediated by the differential motility activation and suppression of sperm subpopulations within the oviduct, J. Exp. Biol. 209 (2006) 1560–1572.
230
H. Babamoradi et al. / Chemometrics and Intelligent Laboratory Systems 142 (2015) 219–230
[25] F. Hahne, A.H. Khodabakhshi, A. Bashashati, C. Wong, R.D. Gascoyne, A.P. Weng, V. Seyfert-Margolis, K. Bourcier, A. Asare, T. Lumley, R. Gentleman, R.R. Brinkman, Per-channel basis normalization methods for flow cytometry data, Cytometry A 77A (2010) 121–131. [26] H.T. Maecker, J.P. McCoy, R. Nussenblatt, Standardizing immunophenotyping for the Human Immunology Project, Nat. Rev. Immunol. 12 (2012) 191–200. [27] N. Le Meur, A. Rossini, M. Gasparetto, C. Smith, R.R. Brinkman, R. Gentleman, Data quality assessment of ungated flow cytometry data in high throughput experiments, Cytometry A 71A (2007) 393–403. [28] T.A. Oldaker, Quality control in clinical flow cytometry, Clin. Lab. Med. 27 (2007) 671+. [29] D. Garner, L. Johnson, Viability assessment of mammalian sperm using Sybr-14 and propidium iodide, Biol. Reprod. 53 (1995) 276–284. [30] G.A.F. Seber, Multivariate Observations, Wiley, New York, 1984. [31] G.H. Golub, C.F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore; London, 1989. [32] J.V. Watson, Flow Cytometry Data Analysis, Cambridge University Press, Cambridge, 1992. [33] R. Mann, D. Popp, R. Hand, The use of projections for dimensionality reduction of flow cytometric data, Cytometry 5 (1984) 304–307. [34] M. Ankerst, M.M. Breunig, H.-. Kriegel, J. Sander, OPTICS: ordering points to identify the clustering structure, SIGMOD Rec, 281999. 49–60. [35] M. Daszykowski, B. Walczak, D. Massart, Looking for natural patterns in analytical data. 2. Tracing local density with OPTICS, J. Chem. Inf. Comput. Sci. 42 (2002) 500–507. [36] J. Sander, X. Qin, Z. Lu, N. Niu, A. Kovarsky, Automatic extraction of clusters from hierarchical clustering representations, Lect. Notes Artif. Intell. 2637 (2003) 75–87. [37] M.M. Breunig, H. Kriegel, J. Sander, Fast hierarchical clustering based on compressed data and OPTICS, Lect. Notes Comput. Sci 1910 (2000) 232–242.
[38] M. Daszykowski, B. Walczak, Density-based clustering methods, in: R. Tauler, S. Brown, B. Walczak (Eds.), Comprehensive Chemometrics: Chemical and Biochemical Data Analysis, Elsevier, 2009, pp. 635–654. [39] G. Finak, J. Perez, A. Weng, R. Gottardo, Optimizing transformations for automated, high throughput analysis of flow cytometry data, BMC Bioinforma. 11 (2010) 546. [40] I. Naim, S. Datta, G. Sharma, J.S. Cavenaugh, T.R. Mosmann, SWIFT: scalable weighted iterative sampling for flow cytometry clustering, IEEE (2010) 509–512. [41] S. Mayo, D. Acevedo, C. Quinones-Torrelo, I. Canos, M. Sancho, Clinical laboratory automated urinalysis: Comparison among automated microscopy, flow cytometry, two test strips analyzers, and manual microscopic examination of the urine sediments, J. Clin. Lab. Anal. 22 (2008) 262–270. [42] A. Smilde, J. Jansen, H. Hoefsloot, R. Lamers, J. van der Greef, M. Timmerman, ANOVA-simultaneous component analysis (ASCA): a new tool for analyzing designed metabolomics data, Bioinformatics 21 (2005) 3043–3048. [43] P. Harrington, N. Vieira, J. Espinoza, J. Nien, R. Romero, A. Yergey, Analysis of variance–principal component analysis: a soft tool for proteomic discovery, Anal. Chim. Acta 544 (2005) 118–127. [44] H. Zhang, L.O. Cardell, J. Bjorkander, M. Benson, H. Wang, Comprehensive profiling of peripheral immune cells and subsets in patients with intermittent allergic rhinitis compared to healthy controls and after treatment with glucocorticoids, Inflammation 36 (2013) 821–829. [45] J. Toedling, P. Rhein, R. Ratei, L. Karawajew, R. Spang, Automated in-silico detection of cell populations in flow cytometry readouts and its application to leukemia disease monitoring, BMC Bioinforma. 7 (2006) 282. [46] G. Zwanenburg, H.C.J. Hoefsloot, J.A. Westerhuis, J.J. Jansen, A.K. Smilde, ANOVA– principal component analysis and ANOVA–simultaneous component analysis: a comparison, J. Chemom. 25 (2011) 561–567.