Food Research International 52 (2013) 185–197
Classification of pig fat samples from different subcutaneous layers by means of fast and non-destructive analytical techniques
Giorgia Foca a,b,⁎, Davide Salvo a, Adelaide Cino a, Carlotta Ferrari a, Domenico Pietro Lo Fiego a,b, Giovanna Minelli a,b, Alessandro Ulrici a,b
a Department of Life Sciences, University of Modena and Reggio Emilia, Padiglione Besta, Via Amendola 2, 42122 Reggio Emilia, Italy
b Interdipartimental Research Centre for Agri-Food Biological Resources Improvement and Valorisation, University of Modena and Reggio Emilia, Padiglione Besta, Via Amendola 2, 42122 Reggio Emilia, Italy
Article info
Article history: Received 23 January 2013; Accepted 11 March 2013
Keywords: Pig fat; Multivariate classification; Variable selection; Tristimulus colorimetry; FT-NIR spectroscopy; Hyperspectral imaging

Abstract
In the meat industry the fat portions coming from two different subcutaneous layers, i.e., inner and outer, are destined to the manufacturing of different products, hence the availability of cheap, rapid and affordable methods for the characterization of the overall fat quality is desirable. In this work the potential usefulness of three techniques, i.e. tristimulus colorimetry, FT-NIR spectroscopy and NIR hyperspectral imaging, was tested to rapidly discriminate fat samples coming from the two different layers. To this aim, various multivariate classification methods were used, also including signal processing and feature selection techniques. The classification efficiency in prediction obtained using colorimetric data was not fully satisfactory (78.1%); conversely, the NIR-based spectroscopic methods gave much more satisfactory models, reaching a prediction efficiency higher than 95%. In general, the samples of the outer layer showed a higher degree of variability than the samples of the inner layer. This is probably due to a greater variability of the outer samples in terms of fatty acid composition and water amount. © 2013 Elsevier Ltd. All rights reserved.
Abbreviations: coif: wavelet of the “coiflets” family; CV: Cross validation; d1: First order derivative pretreatment; d2: Second order derivative pretreatment; db: wavelet of the “daubechies” family; det1: Linear detrend pretreatment; det2: Quadratic detrend pretreatment; EFF: Efficiency %; FOP: Fiber Optic Probe; FT-NIR: Fourier Transform-Near InfraRed; HSI: HyperSpectral Imaging; In: Class corresponding to the inner layer samples; In_low: Samples of class In measured on the lower face of the disk; In_up: Samples of class In measured on the upper face of the disk; iPLS-DA: Interval Partial Least Squares-Discriminant Analysis; IS: Integrating Sphere; LV: Latent Variable; m: Meancentering pretreatment; MSC: Multiplicative Scatter Correction pretreatment; N: None pretreatment (raw spectra); NIR: Near InfraRed; Out: Class corresponding to the outer layer samples; Out_low: Samples of class Out measured on the lower face of the disk; Out_up: Samples of class Out measured on the upper face of the disk; PC: Principal Component; PCA: Principal Component Analysis; PLS-DA: Partial Least Squares-Discriminant Analysis; Q–T2: Q residuals versus Hotelling's T2; RMSECV: Root Mean Square Error in Cross Validation; ROI: Region Of Interest; S: Smoothing pretreatment; SNV: Standard Normal Variate pretreatment; S/N: Signal to noise; sym: wavelet of the “symlets” family; TRN: Training set (including only Out_up and In_low measurements); TST1: First test set (including only Out_up and In_low measurements); TST2: Second test set (including only Out_low and In_up measurements); VIP: Variable Importance in Projection; WPT: Wavelet Packet Transform; WPTER: Wavelet Packet Transform for Efficient pattern Recognition.
⁎ Corresponding author at: Department of Life Sciences, University of Modena and Reggio Emilia, Padiglione Besta, Via Amendola 2, 42122 Reggio Emilia, Italy. Tel.: +39 0522 522042; fax: +39 0522 522027. E-mail address: [email protected] (G. Foca).
0963-9969/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.foodres.2013.03.022

1. Introduction
The technological quality of the subcutaneous adipose tissue of pigs is mainly represented by its consistency and its resistance against oxidative processes, that render it suitable for processing and storage.
The consistency is closely correlated to the lipid and water content, to the texture of the connective tissue and to the nature of the fatty acids which constitute the lipids. A low lipid content associated with a high water content leads to a poor consistency of the adipose tissue (Lebret & Mourot, 1988). The fatty acid composition also plays a key role in determining the consistency of the adipose tissue: a higher degree of unsaturation corresponds to a lower melting point of the fat and, consequently, to a lower consistency (Enser, 1983). Furthermore, an adipose tissue excessively rich in unsaturated fatty acids is certainly positive from the nutritional point of view, but it can create serious problems from the technological point of view, since it can easily undergo hydrolytic and oxidative phenomena during manufacturing (Lo Fiego, 1996; Wood et al., 2008). As pointed out by Lebret and Mourot (1988), a high organoleptic and technological quality is not generally associated with a high content of polyunsaturated fatty acids. A very peculiar aspect of the pig fat covering tissue is that it consists of two or more layers having different composition and coming from different metabolic pathways (Mersmann & Leymaster, 1984). The outer layer, close to the rind, has a greater consistency and is richer in unsaturated fatty acids than the inner layer (Malmfors, Lundstrom, & Hansson, 1978). The greater consistency of the outer layer, despite the higher degree of unsaturation, is probably due to a greater content of collagen, i.e. of connective tissue, and to a greater organization of the connective structure surrounding the adipocytes (Lo Fiego, Tedeschi, Santoro, & Nanni Costa, 1987).
The stratification of the pig fat in different layers plays an important role in the Italian industry. The processing industry uses the outer fat, considered harder, in the form of cubes for the production of salami and sausages, while the inner layer tissue, classified as soft fat, is mainly destined for melting (Santoro, 1983). From the analytical point of view, the fat quality is generally estimated by means of chemical analyses, i.e. redox titrations to determine the iodine value, which is used to assess the amount of unsaturation in fatty acids, and gas chromatography for the determination of the fatty acid composition. However, these common methods are expensive, time-consuming, detrimental to the environment because of the use of chemical reagents, and not suitable for following an industrial process in real time. Therefore, the aim of the present work is to verify the suitability of three fast and non-destructive methods, i.e. tristimulus colorimetry, Fourier Transform-Near InfraRed (FT-NIR) spectroscopy and NIR HyperSpectral Imaging (HSI), to discriminate between fat samples coming from two different subcutaneous layers. These methods were chosen because they are particularly flexible: they can supply qualitative and, in some cases, quantitative information with minimal or no sample preparation, which makes them suitable for on-line applications. Colorimetric measurements have been recently used in the literature to investigate possible relationships between fat color and fatty acid composition (Wood et al., 2003), which is often variable, depending on genotype and sex of the swine and on the rearing system. Carrapiso and Garcia (2005) proved that CIE L*a*b* variables of subcutaneous fat were closely related to fatty acid composition.
In particular, the largest correlations involve L*, which is negatively related to most unsaturated fatty acids and positively to the most abundant saturated fatty acids, while Gandemer (2002) found larger whiteness and pinkness in firm fat than in low consistency fat. The potential of NIR spectroscopy for predicting the fatty acid composition of fat samples is well known, and a number of papers have been published on this topic. Among them, some works are based on NIR measurements acquired by means of a fiber optic probe (Gonzalez-Martin, Gonzalez-Perez, Hernandez-Mendez, & Alvarez-Garcia, 2003; Pérez-Marín, De Pedro Sanz, Guerrero-Ginel, & Garrido-Varo, 2009), while other works made use of diffuse reflectance, transmission and transflectance measurements (Gjerlaug-Enger, Kongsro, Aass, Ødegard, & Vangen, 2011; Müller & Scheeder, 2008; Ripoche & Guillard, 2001), all of them generally reporting good results. The paper by Pérez-Juan et al. (2010) merits a particular mention for our purposes since they used NIR spectroscopy to analyze the fatty acid composition at two different locations of the subcutaneous fat. As for hyperspectral imaging, which is an emerging technique, at present only a limited number of works have been published concerning the analysis of fat (for instance, Kobayashi, Matsui, Maebuchi, Toyota, & Nakauchi, 2010 on beef samples and Kamruzzaman, ElMasry, Sun, & Allen, 2011 on lamb samples). In the context of swine products only a single paper exists (O'Farrell, Wold, Høy, Tschudi, & Schulerud, 2010), aimed at the on-line quantification of the fat amount in inhomogeneous pork trimmings. HSI simultaneously collects the spectral and spatial information on a sample to compose a visual image of the distribution of its components.
This kind of representation enables the characterization of complex heterogeneous samples by looking at the spatial features and allows the identification of a wide range of surface constituents by looking at the spectral features (Gowen, O'Donnell, Cullen, Downey, & Frias, 2007). In particular, HSI is useful not only to define what chemical species are present in the sample and how much of each is present, but also, and principally, to indicate where they are located. Moreover, it is useful whenever a fast and non-destructive technique is needed to characterize a sample and to obtain a visual representation of the analysis. The datasets acquired with the different analytical techniques were initially processed by means of Principal Component Analysis (PCA), which was used as an explorative tool to detect possible outlier measurements, then they were subjected to classification analysis. First, the classification was performed on the whole original data using the Partial Least Squares-Discriminant Analysis (PLS-DA) algorithm, and then the NIR-based datasets were also subjected to variable selection before building the classification models. Two variable selection methods were used, i.e. iPLS-DA, a local modeling procedure based on PLS-DA, and WPTER (Wavelet Packet Transform for Efficient pattern Recognition), a wavelet-based feature selection algorithm (Antonelli et al., 2004; Cocchi et al., 2004, 2005). The obtained classification models were compared, both in order to evaluate the predictive ability of the models obtained using different analytical techniques, and to interpret the chemical meaning of the choices made by the tested algorithms. Particular attention was paid to the samples that were incorrectly classified in order to understand the reasons for their misclassification.

2. Experimental

2.1. Samples
Sixty-six pigs from Italian Landrace × Large White crossbreeds provided 205 samples of fat tissue by means of the following sampling procedure. The subcutaneous adipose tissue was cut by hand by an expert operator at the last rib level so as to obtain disks of fat tissue having a diameter of about 3 cm and a thickness ranging from 3 mm to 2 cm. These fat samples consisted of two adjacent layers, lying at different depths with respect to the rind. The layer close to the rind (which was previously removed) was labeled as Outer (Out) and the layer far from the rind as Inner (In). The two layers were then separated by means of a manual cut, after a visual assessment of the line of demarcation of the layers, to obtain the corresponding Out and In samples. It should be noted that for each pig specimen from 1 to 4 Out and In samples were collected; this means that for each original fat disk the Out and In parts were not necessarily all kept.
To keep track of the sampling point where the following analytical measurements were carried out on the fat disk, an additional labeling was introduced: Out_up for the upper face of the Out layer, Out_low for the lower face of the Out layer, In_up for the upper face of the In layer and In_low for the lower face of the In layer. The procedure adopted to delimit and label the fat samples is represented in Scheme 1. All the collected samples were stored in dark conditions at −20 °C. Before analysis, the samples were slowly defrosted at 4 °C for 1 h and then at room temperature for a further 30 min. All the measurements were performed at room temperature. Each day, only the samples to be analyzed were defrosted and analyzed following a random order. Then, this order was shuffled, repeated measurements were performed on each sample and, at the end of the daily measurement session, the samples were stored again at −20 °C.

Scheme 1. Sampling scheme of the fat tissue samples.

2.2. Sample preparation and analysis

2.2.1. Tristimulus colorimetry
Colorimetric measurements were accomplished using a Chroma Meter CR-400 Konica Minolta (CIE standard illuminant D65) tristimulus colorimeter. The standard instrumental procedure for calibration was applied before use. For each fat disk, four measurements were performed. In particular, a total of 820 measurements (= 205 samples × 2 acquisitions on each disk face × 2 repeated measurements) was acquired. Each recorded triplet of values, consisting of the CIE L*, a* and b* colorimetric parameters, is the result of the instrumental mean of three measurements. In addition, starting from the recorded L*, a* and b* values, two further colorimetric parameters, i.e. Hue and Chroma, were calculated using the following equations:

\[ \mathrm{Hue} = \arctan\left( b^{*} / a^{*} \right) \tag{1} \]

\[ \mathrm{Chroma} = \left[ (a^{*})^{2} + (b^{*})^{2} \right]^{0.5} \tag{2} \]
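As a minimal sketch of Eqs. (1) and (2), the two derived parameters can be computed from a recorded a*, b* pair as follows. The function name `hue_chroma` and the numerical values are illustrative assumptions; the use of `atan2` instead of a plain arctangent of the ratio is also an assumption of this sketch, made only to avoid a division by zero.

```python
import math
from typing import Tuple


def hue_chroma(a_star: float, b_star: float) -> Tuple[float, float]:
    """Return (hue in radians, chroma) from CIE a* and b*, following Eqs. (1)-(2)."""
    # Assumption of this sketch: atan2(b*, a*) equals arctan(b*/a*) when a* > 0
    # and stays defined when a* == 0.
    hue = math.atan2(b_star, a_star)
    # Chroma is the Euclidean length of the (a*, b*) vector: (a*^2 + b*^2) ** 0.5.
    chroma = math.hypot(a_star, b_star)
    return hue, chroma


# Illustrative values only, not taken from the measurements of the paper.
hue, chroma = hue_chroma(3.0, 4.0)
print(f"hue = {hue:.4f} rad, chroma = {chroma:.1f}")
```

Hue is returned in radians here; a conversion with `math.degrees` may be preferred when hue angles are reported in degrees.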
2.2.2. FT-NIR spectroscopy
FT-NIR measurements were performed using a Bruker Optic MPA FT-NIR spectrophotometer equipped with two different sampling tools: Integrating Sphere (IS) working in the 3800–12,500 cm−1 region and Fiber Optic Probe (FOP) working in the 4000–12,500 cm−1 region. The spectra were acquired in reflectance mode at 2 cm−1 resolution as the average of 64 scans. For each sampling tool, a total of 1640 spectra were acquired, as the result of (205 samples) × (2 acquisitions on each disk face) × (2 acquisition sessions, i.e., replicate measurements) × (2 repeated spectra in each session).

2.2.3. Hyperspectral imaging
Six Out and six In samples were randomly selected from the original sample set to be imaged by means of the HSI system. Only 12 samples out of 205 were examined using imaging since our aim was limited to evaluating the feasibility of this technique to discriminate fat samples from different subcutaneous layers, also considering that image processing by multivariate analysis demands a considerable computational effort. The HSI system used includes a desktop NIR Spectral Scanner (DV Optic) embedding a Specim N17E reflectance imaging spectrometer, coupled to a Xenics XEVA 2608 camera (320 × 256 pixels), and covers the spectral range from 900 to 1700 nm (corresponding to the range 11,111–5882 cm−1) with a 5 nm resolution, for a total of 161 wavelengths. For each sample, the upper face as well as the lower one were imaged, providing for each image scene a data format, called hypercube, composed of 320 pixels per row and with a number of rows of about 200 (depending on the image size), with a spatial resolution of 0.47 mm. A silicon carbide (SiC) sandpaper sheet was used as the background for all the acquired images, since it is characterized by a very low and constant reflectance spectrum (Burger & Geladi, 2006).
The intensity values of the raw images were then converted to the corresponding reflectance values, by applying a simple external calibration based on the high-reflectance standard reference and on the dark current measured by covering the camera lens with its cap (Burger & Geladi, 2005).
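The external calibration mentioned above can be sketched as a pixel-wise, band-wise rescaling between the dark current and the high-reflectance standard. This is a minimal illustration, assuming that the standard has a nominal reflectance of 1 and that the raw, white and dark arrays share the same shape (or are broadcastable); the function name `to_reflectance` and the numbers are hypothetical.

```python
import numpy as np


def to_reflectance(raw, white, dark, eps=1e-12):
    """Convert raw intensities to reflectance: (raw - dark) / (white - dark).

    Assumption of this sketch: the white standard is treated as 100% reflective.
    """
    raw, white, dark = (np.asarray(x, dtype=float) for x in (raw, white, dark))
    # Guard against a zero denominator for dead or saturated pixels.
    denominator = np.maximum(white - dark, eps)
    return (raw - dark) / denominator


# Illustrative intensity counts only.
raw = np.array([[100.0, 600.0], [1100.0, 350.0]])
reflectance = to_reflectance(raw, white=1100.0, dark=100.0)
print(reflectance)
```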
Due to the low S/N ratio of the spectra extremes, only the 149 variables between 960 and 1700 nm (corresponding to the range 10,416–5882 cm−1) were considered for further analysis. Furthermore, in order to reduce the computational load and to have homogeneous data sizes, all the images were cropped to the common pixel size, equal to 189 × 122 pixels.

2.3. Data processing and analysis

2.3.1. Explorative analysis

2.3.1.1. PCA and data organization of colorimetric and FT-NIR data.
PCA models were calculated on the autoscaled colorimetric data and on the mean-centered FT-NIR data; the samples lying outside the 99.7% confidence limit in the Q–T2 plot were identified as outliers and removed from the datasets. Each dataset was split into a training set and two separate test sets. In particular, the measurements taken on the Out_up and the In_low faces, i.e. on the extreme faces of each fat sample, whose attribution to the correct layer is certain, were randomly split into the training set (TRN), containing about 2/3 of the objects of each class, and the first of the two test sets (TST1), containing the remaining 1/3 of the objects. All the other measurements, acquired on the Out_low and on the In_up faces of the fat layers (i.e., on the adjacent faces of the two different layers), were included in a second test set (TST2). The manual cut of the fat disks involves a degree of uncertainty about the correct cutting position (in addition to the fact that the fat composition of each layer may change with depth); therefore, we preferred not to include the Out_low and In_up measurements in the model building phase, but to use them only for external validation. As for FT-NIR data, after outlier elimination, the IS and FOP spectra were considered for classification both taking into account the whole spectral range and a limited region of the spectra in the range 10,416–5882 cm−1 (equivalent to 960–1700 nm).
This choice has been made to directly compare the results of the classification models obtained using FT-NIR data with those obtained from the hyperspectral data, that were collected in this limited spectral range.

2.3.1.2. Image cleaning by PCA on hyperspectral data.
Hyperspectral images were firstly subjected to explorative analysis by means of PCA, in
order to perform an image cleaning step (Williams, Geladi, Britz, & Manley, 2012). Indeed, since the main aim of the analysis is to discriminate the two classes Out and In of the fat tissue samples, it is important to remove the other sources of variation which could affect the model, i.e., the pixels corresponding to the sandpaper sheet, which was present in all the images, but also the small rind portions possibly remaining in the Out samples. To this purpose, the effectiveness of several data pretreatments, such as detrend, Standard Normal Variate, first and second derivatives, normalization and meancentering, was evaluated both separately and in different combinations. In the context of PCA, the best combination of pretreatments was selected by evaluating the dispersion and the distance between the different pixel clusters in the PC1–PC2 score plot, since a reliable image cleaning requires a clear distinction between the fat tissue sample and the pixels corresponding to other parts of the image scene (such as background or rind residues). Subsequently, the Region Of Interest (ROI) of each image was selected by considering all the pixels retained in the cleaned image and it was assigned to the corresponding (In/Out) class. The 12 hyperspectral images acquired on the Out_up and In_low faces were randomly divided into a training and a test set, each composed of 3 images for each class. The images belonging to each set were then merged together into a unique training image (TRN image) and test image (TST1 image). The remaining 12 images, acquired on the Out_low and on the In_up faces, were merged together to create a second test set image (TST2 image).

2.3.2. PLS-DA classification models
Despite the use of the additional labeling for the samples (“low” and “up”), aimed at distinguishing whether the measurements were acquired on the upper or the lower face of each disk, all the classification models were computed considering only the two main classes Out and In. PLS-DA was used as classification method; for each PLS-DA model, the number of Latent Variables (LVs) was chosen on the basis of the minimum value of the Root Mean Square Error in Cross-Validation (RMSECV). In particular, a customized cross-validation (11 deletion groups) was used for the colorimetric and for the FT-NIR data, paying attention to keep the replicate measurements of each sample in the same deletion group, while for the hyperspectral data a contiguous-blocks cross-validation (10 deletion groups) was used. The performance of the classification models was expressed in terms of Efficiency % (EFF), which is defined as the geometric mean of Sensitivity (the percentage of objects of each modeled class correctly accepted by the class model) and Specificity (the percentage of objects of other classes correctly rejected by each class model) (Forina, Oliveri, Jäger, Römisch, & Smeyers-Verbeke, 2009). Concerning the pretreatments used, the dataset of colorimetric parameters was autoscaled, while for the FT-NIR spectra 22 different combinations of pretreatments were tested, since it was not possible to establish in advance the best-performing one (Rinnan, Van den Berg, & Engelsen, 2009).
In particular, we considered the following pretreatments:
- none (N), meancentering (m), first order derivative (d1), second order derivative (d2), linear detrend (det1), quadratic detrend (det2), smoothing (S), Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC),
that were tested both separately and in the following combinations:
- d1 + m, d2 + m, det1 + m, det2 + m, S + m, SNV + m, MSC + m, d1 + S + m, d2 + S + m, det1 + S + m, det2 + S + m, SNV + S + m, MSC + S + m.
The hyperspectral dataset was instead analyzed using the following pretreatments:
- m, det1, SNV, d1, d2, normalization (nor), det1 + m, SNV + m, d1 + m, d2 + m, nor + m.
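To make one of the listed combinations concrete, the following is a minimal sketch of SNV followed by meancentering (SNV + m). It is not the PLS Toolbox implementation used in the paper; the function names and the toy spectra are assumptions. SNV works row-wise (each spectrum is centered and scaled by its own mean and standard deviation), whereas meancentering works column-wise and reuses the training-set means for any test set.

```python
import numpy as np


def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def mean_center(train, test):
    """Subtract the training-set column means from both training and test spectra."""
    column_means = train.mean(axis=0)
    return train - column_means, test - column_means


# Toy spectra (hypothetical): the second one is the first with a multiplicative
# and an additive scatter effect, which SNV is expected to remove.
base = np.array([1.0, 2.0, 4.0, 3.0])
X = np.vstack([base, 2.0 * base + 5.0, [2.0, 1.0, 3.0, 5.0]])
X_snv = snv(X)
X_train, X_test = mean_center(X_snv[:2], X_snv[2:])
```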
For each dataset, the model showing the highest EFF value in cross-validation was selected.

2.3.3. Variable selection

2.3.3.1. Variable selection by means of iPLS-DA.
The implementation of Interval Partial Least Squares regression for Discriminant Analysis (iPLS-DA) (Ferrari, Foca, Vignali, Tassi, & Ulrici, 2011; Leardi & Norgaard, 2004) was applied both to the hyperspectral and to the FT-NIR spectral datasets, in order to select only those intervals containing the most relevant information for classification. Essentially, iPLS-DA consists of subdividing the whole spectral range into a user-defined number of intervals of equal length, and then selecting the intervals most useful for classification by an iterative procedure, which can follow either a forward or a reverse search strategy. Concerning the whole FT-NIR spectra, three different interval widths, i.e., 50, 100 and 200 variables, were tested in both forward and reverse modes. Moreover, iPLS-DA was also applied to both the FOP and IS datasets considering only the restricted 10,416–5882 cm−1 range; in this case, only one interval width (50 variables) was considered. As for hyperspectral data, iPLS-DA was applied in the forward mode using an interval size of 7 original variables. The pretreatment method as well as the cross-validation method were the same as those previously considered for the calculation of the PLS-DA models on the whole spectral range.

2.3.3.2. Variable selection by means of WPTER.
The aim of the WPTER algorithm is essentially to find a limited number of variables, called wavelet coefficients, which lead to an efficient separation among objects belonging to different classes, through the decomposition into the Wavelet Packet Transform (WPT) domain of the monodimensional signals (e.g., NIR spectra) that describe each object.
In particular, the Wavelet Transform (Walczak, 2000) allows each analyzed signal to be represented in an alternative domain, where the different frequencies are separated while the localization in the original domain is maintained. In this manner, in addition to the single intensity values, other useful aspects like peak widths, slopes of selected portions of the signal or discontinuities are also taken into account. For a detailed description of the WPTER algorithm the reader is referred to dedicated literature (Cocchi, Seeber, & Ulrici, 2001; Foca et al., 2009; Ulrici et al., 2008); here it should only be noted that a number of parameters can be set to obtain a WPTER model, such as the wavelet filter, the decomposition level in the WPT domain and the percentage of wavelet coefficients to be retained. In general, it is not possible to know in advance the optimal combination of the WPTER parameters, so it is appropriate to cycle over different possible combinations in order to select the best classification model based on the performance in cross-validation. In the present work, the WPTER algorithm was applied only to the FT-NIR datasets: 9 different wavelet filters (db1, db2, sym4, sym5, sym6, sym7, sym8, coif1, coif5) and 5 percentages of preselected wavelet coefficients (0.1%, 0.5%, 1%, 5%, 10%) were used, setting the maximum decomposition level equal to 5. The combination of all these parameters gave a total of 45 cycles of calculation, that were tested on all the four considered FT-NIR datasets (FOP whole spectrum and restricted range, IS whole spectrum and restricted range). The wavelet coefficients selected by WPTER can be used as input variables for the calculation of discriminant models. For comparison purposes, the same method adopted to classify the whole FT-NIR spectra was used, i.e., PLS-DA calculated on autoscaled wavelet coefficients with customized cross-validation (11 deletion groups).
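The parameter cycling described above can be sketched as a plain grid enumeration. The wavelet names, percentages and maximum level are those listed in the text; the function `best_setting` and the scoring callable `cv_efficiency` are hypothetical placeholders for a WPTER run followed by cross-validated PLS-DA, which is not reproduced here.

```python
from itertools import product

# Settings listed in the text.
WAVELETS = ["db1", "db2", "sym4", "sym5", "sym6", "sym7", "sym8", "coif1", "coif5"]
RETAINED_PERCENT = [0.1, 0.5, 1, 5, 10]
MAX_LEVEL = 5

# 9 wavelet filters x 5 percentages = 45 calculation cycles per dataset.
grid = list(product(WAVELETS, RETAINED_PERCENT))


def best_setting(cv_efficiency):
    """Return the (wavelet, percent) pair with the highest cross-validation score.

    `cv_efficiency(wavelet, percent, max_level)` is a user-supplied callable
    (hypothetical in this sketch) returning the EFF value in cross-validation.
    """
    return max(grid, key=lambda setting: cv_efficiency(*setting, MAX_LEVEL))


# Dummy scorer used only to show the call; it does not reflect real results.
def dummy(wavelet, percent, max_level):
    return 1.0 if (wavelet, percent) == ("sym6", 1) else 0.0


print(len(grid), best_setting(dummy))
```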
PCA, PLS-DA and iPLS-DA models were computed by means of PLS Toolbox ver. 6.5.1 (Eigenvector Research Inc.) for the Matlab© platform 7.11 R2010b (The MathWorks Inc.). WPTER was written in Matlab© ver. 7.0 language and uses some routines from the Wavelet Toolbox ver. 3.0 (The MathWorks Inc.) and from the PLS Toolbox ver. 6.5.1.
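The forward search strategy of interval selection described in Section 2.3.3.1 can be illustrated with a generic greedy loop. This is only a sketch under stated assumptions and not the iPLS-DA routine of the PLS Toolbox: the names `make_intervals` and `forward_interval_selection` are hypothetical, the last interval is allowed to be shorter than the others, and the `score` callable stands in for the cross-validated performance of a PLS-DA model built on the chosen columns.

```python
from typing import Callable, List, Sequence, Tuple


def make_intervals(n_variables: int, width: int) -> List[range]:
    """Split variable indices into contiguous intervals (the last may be shorter)."""
    return [range(start, min(start + width, n_variables))
            for start in range(0, n_variables, width)]


def forward_interval_selection(
    intervals: Sequence[range],
    score: Callable[[List[int]], float],
) -> Tuple[List[int], float]:
    """Greedy forward search: add the interval that most improves `score`,
    and stop when no remaining interval gives an improvement."""
    selected: List[int] = []
    best = float("-inf")
    remaining = list(range(len(intervals)))
    while remaining:
        trials = []
        for i in remaining:
            columns = [c for j in selected + [i] for c in intervals[j]]
            trials.append((score(columns), i))
        trial_score, trial_index = max(trials)
        if trial_score <= best:
            break
        best = trial_score
        selected.append(trial_index)
        remaining.remove(trial_index)
    return selected, best


# Toy scorer (hypothetical): rewards three "informative" variables and slightly
# penalizes model size; a real run would use a cross-validated PLS-DA figure.
def toy_score(columns):
    informative = {3, 4, 120}
    return len(informative.intersection(columns)) - 0.001 * len(columns)


intervals = make_intervals(n_variables=149, width=7)
selected, best = forward_interval_selection(intervals, toy_score)
print(selected)
```

A reverse search would start from all intervals and remove, at each step, the one whose elimination improves the score the most.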
3. Results and discussion

3.1. Colorimetric data

3.1.1. Explorative data analysis
PCA was performed on the autoscaled data (4 PCs and 99.9% explained variance); the Q–T2 plot revealed the presence of 22 outlier measurements (less than 3% of the whole dataset), that were removed before classification. The analysis of the score plot revealed the absence of distinct clusters for the Out and In sample categories, suggesting that their discrimination based on colorimetric data is not trivial and confirming what was previously supposed on the basis of the visual evaluation of the samples, since the two layers looked very similar. A partial separation between the In and the Out samples was only observed along the PC3 axis, as can be seen in the PC1–PC3 biplot reported in Fig. 1. In particular, the fat samples from the inner layer show on average higher values of L* compared to those from the outer layer. The initial choice to include Hue and Chroma in the set of colorimetric parameters was due to the fact that they have often proved to be informative when studying the color characteristics of pork meat and fat (Hallenstvedt, Kjos, Øverland, & Thomassen, 2012; Van Oeckel, Warnants, & Boucque, 1999), but the analysis of the present colorimetric data did not confirm their helpfulness for the particular characterization of In and Out samples.

Fig. 1. PC1–PC3 biplot from PCA on colorimetric data. Class Out: light gray circles; class In: dark gray diamonds.

3.1.2. PLS-DA classification
After outlier elimination, a training set and two different test sets were built as previously described in Section 2.3.1.1. The number of samples included in each set is reported in Table 1. The results of the PLS-DA model calculated on the colorimetric parameters are shown in the first column of Table 2. The Efficiency values obtained for the prediction of the objects of the two test sets are rather different: for TST1, EFF is equal to 78.1%, while for TST2 it is only 51.9%. These results gave a first confirmation of our hypothesis about the within-sample compositional variability of the two fat layers.

Table 1
Number of samples included in the different sets after outlier elimination.

                      TRN set              TST1 set             TST2 set
                      Out class  In class  Out class  In class  Out class  In class
Colorimetric data     126        139       63         62        205        203
FT-NIR data (IS)      280        274       125        128       403        408
FT-NIR data (FOP)     284        276       128        128       393        408
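The outlier screening based on the Q–T2 plot, which determines the set sizes of Table 1, can be sketched as follows. This is a simplified illustration, not the PLS Toolbox procedure: the function name `pca_q_t2` and the simulated data are assumptions, and the 99.7% confidence limits are replaced here by empirical 99.7th percentiles, whereas dedicated software normally derives them from theoretical distributions.

```python
import numpy as np


def pca_q_t2(X, n_components):
    """Return Hotelling's T2 and Q residuals of each row for a mean-centered PCA model."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                      # loadings (variables x PCs)
    T = Xc @ P                                   # scores (samples x PCs)
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    t2 = np.sum(T ** 2 / score_var, axis=1)      # distance inside the model plane
    q = np.sum((Xc - T @ P.T) ** 2, axis=1)      # squared residual outside the model
    return t2, q


# Simulated data (hypothetical): 200 samples lying close to a 2-PC plane,
# plus one sample pushed away from it.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
X += 0.01 * rng.normal(size=X.shape)
X[5] += 3.0 * rng.normal(size=10)

t2, q = pca_q_t2(X, n_components=2)
# Assumption: empirical percentiles stand in for the 99.7% confidence limits.
flagged = np.where((t2 > np.percentile(t2, 99.7)) | (q > np.percentile(q, 99.7)))[0]
print(flagged)
```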
3.2. FT-NIR data

3.2.1. Explorative data analysis
PCA was applied to both the meancentered IS and FOP FT-NIR spectral datasets. As for the IS dataset, a 4 PCs model explaining 97.4% of the total variance was obtained, after the elimination of 22 outlier spectra based on the Q–T2 plot. Similarly, for the FOP dataset a 4 PCs model (98.7% explained variance) was obtained after the removal of 23 outlier spectra. The PC1–PC2 score plot obtained on the IS dataset, reported in Fig. 2, highlights that the samples belonging to the two layers are partly separated along PC2, though the clusters visibly overlap. Similar results were obtained for the FOP dataset but, in this case, the separation was due to the simultaneous contribution of PC1 and PC2. It should be noted that both score plots showed a slightly wider dispersion of the samples belonging to the outer fat layer (Out class). This finding may indicate that the Out samples present a greater variability in chemical composition or, in other terms, that they are more heterogeneous than the In ones. A final remark concerns the distribution of the samples in the score plots of the two datasets. In the FOP score plot, the repeated and replicate measurements made on each sample are more spread across the PC space than in the corresponding IS score plot. This observation suggests that the sampling technique used for spectra collection affects the reproducibility of the measurements. While the fiber optic probe is generally considered a more rapid and flexible tool, the integrating sphere seems to offer better acquisition performance in terms of measurement repeatability and reproducibility.

3.2.2. PLS-DA classification
Also in this case, after outlier elimination, the training set and the two test sets were built; the number of samples included in each set is reported in Table 1.
Table 3 reports the cross-validation Efficiency values obtained with the different pretreatments, together with the number of LVs selected for each model. As for the dimensionality of the models, the FOP dataset in general required a lower number of LVs. The Efficiency values, on the contrary, confirm the primacy of the Integrating Sphere as sampling tool, since the IS results are generally higher than the corresponding FOP results. The best models, chosen based on the Efficiency values estimated in cross-validation, were obtained using SNV + S + m and S + m for the IS and FOP datasets, respectively. Table 2 reports the classification results of the PLS-DA classification models obtained on the IS and FOP datasets, both considering the whole spectra and the restricted spectral range. The IS spectral dataset always gave better results than the FOP dataset. In particular, the importance of using a sampling tool that allows measurements to be performed on a wider surface area, such as the integrating sphere, seems to be more pronounced when the test samples do not have a clearly homogeneous composition. In fact, TST2 samples were predicted with EFF values in the range 61.1–66.6% for the IS datasets, while the EFF reached a maximum value of only 55.7% for the FOP datasets. On the contrary, the efficiency in prediction for the TST1 samples was at least equal to 96.0% for all the models, confirming the effectiveness of using FT-NIR spectroscopy to distinguish the fat layers.
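The Efficiency values discussed above follow the definition given in Section 2.3.2, i.e. the geometric mean of Sensitivity and Specificity. A minimal sketch for one modeled class is given below; the function name `efficiency` and the confusion counts are hypothetical and do not correspond to any model of the paper.

```python
import math


def efficiency(tp: int, fn: int, tn: int, fp: int) -> float:
    """Efficiency % as the geometric mean of Sensitivity % and Specificity %.

    tp/fn: objects of the modeled class correctly accepted / wrongly rejected;
    tn/fp: objects of the other class correctly rejected / wrongly accepted.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts: Sensitivity = 80%, Specificity = 45%.
eff = efficiency(tp=80, fn=20, tn=45, fp=55)
print(f"EFF = {eff:.1f}%")
```

Because a geometric mean is used, a model that accepts almost everything (high Sensitivity, low Specificity) is penalized more than it would be by an arithmetic mean.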
Table 2 Results of the best PLS-DA models obtained on colorimetric and FT-NIR data.

| | Colorimetric data | FT-NIR data (IS dataset) | FT-NIR data (IS dataset) | FT-NIR data (FOP dataset) | FT-NIR data (FOP dataset) |
|---|---|---|---|---|---|
| Spectral range (cm−1) | / | 12,500–3800 | 10,416–5882 | 12,500–4000 | 10,416–5882 |
| # of variables | 5 | 4600 | 2352 | 4400 | 2352 |
| Pretreatment | Autoscale | SNV + S + m | MSC + m | S + m | m |
| # of LVs | 3 | 11 | 11 | 11 | 12 |
| EFF (TRN) | 80.0 | 99.3 | 100.0 | 98.8 | 99.1 |
| EFF (CV) | 75.6 | 98.7 | 99.5 | 97.5 | 98.0 |
| EFF (TST1) | 78.1 | 97.6 | 99.2 | 96.0 | 96.0 |
| EFF (TST2) | 51.9 | 66.6 | 61.1 | 46.5 | 55.7 |

The test set results (EFF (TST1) and EFF (TST2)) show the model performance in prediction.
Fig. 2. PC1–PC2 score plot obtained for the IS dataset. Class Out: light gray circles; class In: dark gray diamonds.

Considering the models obtained on the limited range 10,416–5882 cm−1, it is interesting to notice that this restricted spectral region gave prediction results better than or equal to those of the whole range, with only one exception (EFF of TST2 equal to 61.1% for the IS dataset).

3.2.3. Classification after variable selection

Since the chemical information contained in a NIR spectrum is redundant, it is often convenient to apply variable selection methods, both to decrease the computational load, which is the main hindrance to the possible online implementation of NIR-based control systems, and to increase the robustness of the prediction models (Gosselin, Rodrigue, & Duchesne, 2010). Indeed, the removal of noisy and/or uninformative features is often recommended to improve the performance of the classification models, since these variables may be not only useless for the development of the model itself, but in some cases even detrimental (Andersen & Bro, 2010).

3.2.3.1. iPLS-DA variable selection. Table 4 reports the iPLS-DA model showing the best performance in cross-validation for each FT-NIR dataset, together with the number of selected intervals and variables. Three of the four best models were obtained using the reverse mode for interval selection, and in general the intervals including a lower number of variables gave better results. As far as the number of selected variables is concerned, two comments are noteworthy. Firstly, the models obtained on the spectra limited to the region 10,416–5882 cm−1 required about half the variables of the models obtained on the whole spectra, presumably because of the redundancy of the chemical information revealed by NIR spectroscopy. In other terms, additional overtone bands bringing the same chemical information are present in the whole spectrum with respect to the restricted
range spectrum; therefore the variable selection methods tend to select all the spectral variables where the same chemical information (e.g., first and second overtones of a given bond vibration) is located. Secondly, the models obtained on the IS datasets required about half the variables needed for the FOP datasets. This can be reasonably ascribed to the fact that the spectra acquired using the FOP are less reproducible than the corresponding IS spectra, as previously discussed in Section 3.2.1. Therefore, a higher number of spectral variables is needed for the models calculated on the FOP dataset than for those based on the IS dataset, in order to decrease the contribution of the stochastic component of the signals. With respect to the corresponding PLS-DA models, the models obtained using the interval-based variable selection method showed a slight increase of the performance in cross-validation, while the results in prediction are almost unchanged. Regarding model dimensionality, no particular differences are observed in the number of latent variables used to build the classification models before and after the variable selection.

3.2.3.2. WPTER variable selection. The variable selection was also performed on the FT-NIR signals using a further method, in addition to iPLS-DA. The aim was twofold: i) to compare the results of two different variable selection methods, and ii) to evaluate the possibility of obtaining better results by taking advantage of the signal multiresolution allowed by wavelets.
Table 3 Comparison of the different pretreatments applied to FT-NIR datasets: number of Latent Variables and performance of the PLS-DA models in terms of Efficiency in cross-validation. The best model for each dataset is marked with an asterisk.

| Pretreatment | IS dataset (12,500–3800 cm−1): EFF (CV) | IS dataset: # of LVs | FOP dataset (12,500–4000 cm−1): EFF (CV) | FOP dataset: # of LVs |
|---|---|---|---|---|
| N | 97.5 | 12 | 96.6 | 8 |
| m | 97.3 | 10 | 96.8 | 10 |
| d1 | 98.6 | 11 | 95.7 | 10 |
| d2 | 95.5 | 11 | 91.6 | 10 |
| det1 | 97.7 | 12 | 96.4 | 8 |
| det2 | 97.5 | 12 | 96.6 | 6 |
| S | 97.7 | 12 | 97.1 | 10 |
| SNV | 98.0 | 12 | 96.4 | 9 |
| MSC | 98.0 | 12 | 96.2 | 8 |
| d1 + m | 98.0 | 10 | 95.5 | 9 |
| d2 + m | 95.8 | 11 | 91.9 | 12 |
| det1 + m | 98.2 | 12 | 96.4 | 7 |
| det2 + m | 97.7 | 11 | 96.6 | 7 |
| S + m | 97.5 | 10 | 97.5* | 11 |
| SNV + m | 98.6 | 12 | 96.8 | 7 |
| MSC + m | 98.0 | 11 | 96.2 | 7 |
| d1 + S + m | 98.6 | 9 | 96.8 | 10 |
| d2 + S + m | 98.0 | 9 | 95.4 | 10 |
| det1 + S + m | 98.0 | 10 | 97.1 | 10 |
| det2 + S + m | 98.0 | 12 | 97.3 | 8 |
| SNV + S + m | 98.7* | 11 | 97.1 | 12 |
| MSC + S + m | 98.2 | 9 | 97.1 | 10 |
Table 4 Results of the best iPLS-DA models obtained on FT-NIR data.

| | IS dataset (whole spectrum) | IS dataset (restricted range) | FOP dataset (whole spectrum) | FOP dataset (restricted range) |
|---|---|---|---|---|
| Original spectral range (cm−1) | 12,500–3800 | 10,416–5882 | 12,500–4000 | 10,416–5882 |
| # of original variables | 4600 | 2352 | 4400 | 2352 |
| Pretreatment | SNV + S + m | MSC + m | S + m | m |
| Interval size (variables) | 50 | 50 | 100 | 50 |
| Forward/Reverse modes | Reverse | Forward | Reverse | Reverse |
| # of selected intervals/variables | 34/1700 | 16/800 | 38/3800 | 34/1700 |
| # of LVs | 12 | 10 | 12 | 12 |
| EFF (TRN) | 100.0 | 100.0 | 98.9 | 99.2 |
| EFF (CV) | 100.0 | 100.0 | 98.7 | 98.7 |
| EFF (TST1) | 100.0 | 98.0 | 96.9 | 95.6 |
| EFF (TST2) | 64.6 | 61.3 | 45.4 | 54.6 |

The test set results (EFF (TST1) and EFF (TST2)) show the model performance in prediction.
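The interval-based selection summarized in Table 4 can be sketched as follows. This is only an illustrative outline of the forward mode, not the implementation used for the models above: the spectrum is split into consecutive intervals of equal size (how a shorter leftover interval is handled is an assumption), and intervals are added greedily as long as a score improves. In a real application the score would be the cross-validated EFF of a PLS-DA model built on the selected variables; here a hypothetical `toy_score` stands in for it.

```python
def make_intervals(n_variables, interval_size):
    """Split variable indices into consecutive intervals; the last one may be shorter."""
    return [list(range(start, min(start + interval_size, n_variables)))
            for start in range(0, n_variables, interval_size)]

def forward_interval_selection(n_intervals, score):
    """Greedy forward selection: keep adding the interval that most improves `score`."""
    selected, best = [], float("-inf")
    remaining = list(range(n_intervals))
    while remaining:
        trial = {i: score(sorted(selected + [i])) for i in remaining}
        candidate = max(trial, key=trial.get)
        if trial[candidate] <= best:      # no further improvement: stop
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = trial[candidate]
    return sorted(selected), best

# Hypothetical score: intervals 2 and 5 are informative, all others add noise.
INFORMATIVE = {2, 5}

def toy_score(selected):
    chosen = set(selected)
    return len(chosen & INFORMATIVE) - 0.1 * len(chosen - INFORMATIVE)

intervals = make_intervals(4600, 50)      # whole IS spectrum, 50-variable intervals
selected, best = forward_interval_selection(8, toy_score)
```

The reverse mode works in the opposite direction, starting from all the intervals and removing one at a time. The interval counts in Table 4 are consistent with full intervals only, e.g. 34 intervals of 50 variables give 1700 variables.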
Table 5 reports the results of the best classification models obtained using WPTER, together with the parameters used in the selected WPTER cycle, the number of selected WPTER coefficients and the number of LVs used to build the corresponding PLS-DA classification models. Four different wavelet filters led to the best results for the different datasets, while the same percentage of preselected coefficients, i.e. 10%, was always selected. The IS models were more parsimonious than the FOP ones: the number of selected WPTER coefficients was at most 14 for IS and at least 42 for FOP. The PLS-DA model dimensionality ranged from 4 to 6 LVs, a notable reduction with respect to the models obtained with iPLS-DA. Compared with the results obtained without variable selection (Table 2), the WPTER algorithm gave slightly lower EFF values for the prediction of TST1 and TST2 (except for TST2 of the whole-spectra FOP dataset), but with a drastically lower number of LVs. More in general, the different variable selection methods did not lead to significant improvements of the efficiency values.

3.2.4. Comparison of the spectral regions useful for classification aims

The variable selection methods are by nature conceived to select the portions of a signal that are responsible for the classification of the samples; however, when PLS-DA is used as the classification method, it is also possible to identify the signal regions most useful for classification by looking at the Variable Importance in Projection (VIP) score plots. The VIP scores provide an estimate of the importance of each variable in the projection used in a PLS model: the variables that reach values higher than a fixed limit (usually equal to 1) are considered significant for the model. Hence, the portions of the spectrum corresponding to significant variables are identified as the spectral regions useful for classification (Pigani et al., 2011).
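The VIP criterion described above can be illustrated with a minimal sketch. It assumes a single-response PLS model (PLS1) fitted with the NIPALS algorithm on a 0/1 coded class vector, and the commonly used VIP formula in which the squared normalized weights of each variable are averaged over the latent variables, weighted by the amount of response variance each latent variable explains. The data are synthetic and the function name `pls1_vip` is hypothetical; the software and settings actually used by the authors are not stated in this section.

```python
import numpy as np

def pls1_vip(X, y, n_lv):
    """Fit PLS1 by NIPALS on centered data and return the VIP score of each variable."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    n_var = X.shape[1]
    W = np.zeros((n_var, n_lv))           # unit-norm weight vectors, one per LV
    ssy = np.zeros(n_lv)                  # response sum of squares explained per LV
    for a in range(n_lv):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                        # scores of the current LV
        tt = t @ t
        p = Xr.T @ t / tt                 # X loadings
        q = (yr @ t) / tt                 # y loading
        Xr = Xr - np.outer(t, p)          # deflate X
        yr = yr - q * t                   # deflate y
        W[:, a] = w
        ssy[a] = q ** 2 * tt
    return np.sqrt(n_var * (W ** 2 @ ssy) / ssy.sum())

# Synthetic two-class data in which only variable 3 carries class information.
rng = np.random.default_rng(1)
y = np.repeat([0.0, 1.0], 20)
X = rng.normal(size=(40, 10))
X[:, 3] += 4.0 * y
vip = pls1_vip(X, y, n_lv=2)
important = np.flatnonzero(vip > 1.0)     # variables above the usual threshold of 1
```

Because the weight vectors have unit norm, the mean of the squared VIP scores equals 1 by construction, which is the rationale for using 1 as the threshold.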
In order to gain a possible interpretation of the chemical meaning of the selected variables, Fig. 3 reports the mean IS and FOP spectra in comparison with the signal regions selected by the different variable selection/classification models, or corresponding to VIP scores > 1 for the PLS-DA models on the whole spectra. In particular, the signal regions in A, B and C refer to the models obtained on IS data and those in D, E and F to the models obtained on FOP data. Moreover, A and D correspond to the regions with VIP scores > 1, B and E to the intervals selected by iPLS-DA, and C and F to the regions selected by WPTER. A similar representation is reported in Fig. 4 for the FT-NIR spectra limited to the 10,416–5882 cm−1 spectral region.

The main absorption bands in an FT-NIR spectrum acquired on a pig fat sample are principally related to the presence of water and of fatty acids. As reported by Prieto, Roehe, Lavin, Batten, and Andres (2009), the O–H bonds are responsible for the broad bands centered at about 6895 and 5155 cm−1, while Pérez-Juan et al. (2010) attributed to the C–H bonds the peaks around 8250 cm−1 (C–H stretch second overtone), 7170 cm−1, 5700 cm−1 (C–H stretch first overtone), 4330 cm−1 and 4260 cm−1 (both combination bands of C–H stretch and deformation). Some regions of the spectrum are also informative about the degree of unsaturation in the fatty acid carbon chains: the spectral bands in the 4545–4345 cm−1 interval are related to unsaturated =C–H and C=C groups (Cozzolino & Murray, 2004), and the band at 5950 cm−1 is related to the cis unsaturation in carbon chains (Gonzalez-Martin, Gonzalez-Perez, Alvarez-Garcia, & Gonzalez-Cabrera, 2005). In addition, in the proximity of the water-related bands, in particular in the wide 6850–6370 cm−1 and 5000–4585 cm−1 regions, the absorption bands of the N–H bonds are also present (Prieto et al., 2009). The N–H bonds in this kind of sample can be ascribed to the presence of connective tissue.

The most marked evidence in Fig. 3 is that the iPLS-DA models considered a higher number of variables as important for classification than the other methods and, in particular, these models included many portions of the signals at wavenumbers higher than 10,000 cm−1.
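The band assignments listed above lend themselves to a small lookup structure for interpreting a selected spectral region. The sketch below encodes only the values quoted in this section; representing a band for which only the center is given as a single point, and the names `BANDS` and `assignments`, are choices made for illustration.

```python
# (assignment, lower limit, upper limit) in cm-1; bands quoted only by their
# center are stored as a single point (an illustrative simplification).
BANDS = [
    ("O-H (water)", 6895, 6895),
    ("O-H (water)", 5155, 5155),
    ("C-H stretch second overtone", 8250, 8250),
    ("C-H", 7170, 7170),
    ("C-H stretch first overtone", 5700, 5700),
    ("C-H combination (stretch + deformation)", 4330, 4330),
    ("C-H combination (stretch + deformation)", 4260, 4260),
    ("unsaturated =C-H and C=C", 4345, 4545),
    ("cis unsaturation", 5950, 5950),
    ("N-H", 6370, 6850),
    ("N-H", 4585, 5000),
]

def assignments(low, high):
    """Return the assignments whose band overlaps the region low..high (cm-1)."""
    return sorted({name for name, b_low, b_high in BANDS
                   if b_low <= high and b_high >= low})

# The three regions of the restricted range where all the best models converge.
common_regions = [(7050, 7195), (6520, 6805), (5945, 6025)]
labels = [assignments(low, high) for low, high in common_regions]
```

Applied to the three intervals of the restricted range on which all the best models agree, the lookup returns the C–H, N–H and cis unsaturation assignments, in that order.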
In general, the different methods made use of most of the spectral region at wavenumbers lower than about 7000 cm−1, where many combination bands and the first overtone vibration bands are located. The five spectral regions where all the best models converged to the same results are highlighted with gray rectangles. These regions correspond to the absorption band related to water at about 5155 cm−1, a part of
Table 5 Results of PLS-DA applied on the WPTER coefficients from the best models obtained on FT-NIR data.

| | FT-NIR data (IS dataset) | FT-NIR data (IS dataset) | FT-NIR data (FOP dataset) | FT-NIR data (FOP dataset) |
|---|---|---|---|---|
| Original spectral range (cm−1) | 12,500–3800 | 10,416–5882 | 12,500–4000 | 10,416–5882 |
| # of variables | 4600 | 2352 | 4400 | 2352 |
| Wavelet | db2 | sym5 | coif5 | coif1 |
| % of preselected wavelet coefficients | 10% | 10% | 10% | 10% |
| # of selected WPTER coefficients | 14 | 7 | 42 | 58 |
| # of LVs | 5 | 6 | 4 | 4 |
| EFF (TRN) | 99.1 | 98.2 | 98.0 | 98.0 |
| EFF (CV) | 98.6 | 97.5 | 97.9 | 97.9 |
| EFF (TST1) | 97.2 | 97.2 | 94.0 | 91.5 |
| EFF (TST2) | 61.7 | 55.4 | 54.3 | 52.8 |

The test set results (EFF (TST1) and EFF (TST2)) show the model performance in prediction.
Fig. 3. Whole original IS (black) and FOP (gray) mean spectra (a) and signal regions selected in the different models (see explanation in the text). The gray vertical rectangles delimit the regions selected by all the models.
Fig. 4. IS (black) and FOP (gray) mean spectra in the range 10,416–5882 cm−1 (a) and signal regions selected in the different models (see explanation in the text). The gray vertical rectangles delimit the regions selected by all the models.
the spectral region 4545–4345 cm−1 related to unsaturated =C–H and C=C groups and a small portion of the 6850–6370 cm−1 region, where the N–H absorption band is located. As for the models obtained in the 10,416–5882 cm−1 region (Fig. 4), also in this case a higher number of variables is selected by the iPLS-DA models than by the other ones, in particular for the FOP dataset, where the major part of the spectral variables has been selected. The VIP scores above the threshold value in the PLS-DA models and the WPTER coefficients selected for FOP data highlighted the importance of the spectral region in the 8150–8650 cm−1 interval, where the second overtone of the C–H stretching vibration is located. However, the three regions where all the best models converge, highlighted with gray rectangles covering the intervals 7195–7050 cm−1, 6805–6520 cm−1 and 6025–5945 cm−1, include the regions where the vibrations of the C–H bonds, the N–H bonds and the cis unsaturation in carbon chains are located, respectively.

3.2.5. Qualitative survey on the misclassified samples

In order to gain an overall survey of the samples that were incorrectly classified in each model, in particular to verify if the same samples
have been systematically misclassified by the different colorimetric and FT-NIR analytical techniques, a representation by means of histograms reporting the percentage of misclassification for each fat sample was implemented (Fig. 5), following the procedure described in Scheme 2. Fig. 5 is divided into three parts: part a) reports the percentage of misclassification for each fat sample belonging to TST1 (the Out class samples are included in the dashed rectangle on the left and the In class samples in the dashed rectangle on the right), part b) reports the percentage of misclassification for each fat sample of class Out belonging to TST2, and part c) the percentage of misclassification for each fat sample of class In belonging to TST2. The histogram representation of the TST1 set (Fig. 5a), which includes only the samples at the opposite sides with respect to the cut between the fat layers, i.e. the samples labeled as Out_up and In_low, shows more clearly than the tables that a relatively low number of samples were misclassified and that the colorimetric measurements are the least reliable for the present classification aim. Moreover, Fig. 5a also shows that the samples belonging to the class Out were misclassified more frequently than the samples of the class In. The same behavior, but much more marked, is observed for the samples of TST2, which includes the samples
Fig. 5. Percentage of misclassification for the fat tissue samples. (a): TST1 samples, Out class samples included in the left dashed rectangle and In class samples in the right dashed rectangle; (b): TST2 samples belonging to class Out; (c): TST2 samples belonging to class In.
Scheme 2. Representation of the procedure used to implement the histograms representing the percentage of misclassification (example for sample #19 belonging to TST2).
neighboring the cut between the fat layers, i.e. Out_low and In_up. In Fig. 5b and c a very high degree of misclassification can be noticed for the samples of class Out compared with the samples of class In, independently of the analytical technique and of the classification method used to analyze the samples and to process the data. This survey confirms the presence of a real separation surface between the different fat layers, which is reflected in their different physicochemical characteristics. In particular, the samples corresponding to the outer disks seem to have a greater heterogeneity or, in other terms, a greater variability in fatty acid composition and water amount compared to the inner disks. In addition, the misclassification histograms of the Out_up and Out_low samples suggest the possible existence of a sort of gradient in the variation of the chemical composition at different distances from the rind. In retrospect, the explorative PCA analysis had already given some rough indications, since in the PC1–PC2 score plot (Fig. 2 for the IS dataset) the class Out samples were more widespread in the PC space than the class In samples.
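The per-sample percentages of misclassification plotted in Fig. 5 can be computed along the lines of the sketch below. The exact procedure is the one given in Scheme 2, which is not reproduced in the text; the sketch therefore assumes that, for each sample, the wrong assignments are counted over all the measurements of that sample and over all the models considered. Model names, sample numbers and predictions are invented for illustration.

```python
from collections import defaultdict

def misclassification_percentage(predictions, true_class):
    """predictions: {model name: {sample id: [predicted class per measurement]}}."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for per_sample in predictions.values():
        for sample_id, predicted in per_sample.items():
            total[sample_id] += len(predicted)
            wrong[sample_id] += sum(p != true_class[sample_id] for p in predicted)
    return {s: 100.0 * wrong[s] / total[s] for s in total}

# Hypothetical example: two samples, two models, three measurements per sample.
true_class = {19: "Out", 20: "In"}
predictions = {
    "PLS-DA (IS)": {19: ["In", "In", "Out"], 20: ["In", "In", "In"]},
    "iPLS-DA (IS)": {19: ["In", "Out", "Out"], 20: ["In", "In", "Out"]},
}
percentages = misclassification_percentage(predictions, true_class)
```

A bar chart of `percentages`, with the samples grouped by class, would correspond to one panel of the histogram representation discussed above.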
3.3. Hyperspectral data

3.3.1. Explorative data analysis and image cleaning

Among the several combinations of pretreatments that were evaluated, det1 + m allowed the best distinction of the pixel cluster corresponding to the fat tissue from the clusters formed by the background pixels and, where present, by the rind pixels (Fig. 6a). In all the images, the number of significant PCs was always equal to 2 (accounting for more than 95% of the total variance). By looking at the corresponding score images, it can be noticed that the first PC allowed a clear distinction of the background (Fig. 6b), while the second PC mainly accounted for the presence of the rind and the edge effects (Fig. 6c). A threshold value equal to 0 was thus applied both to the PC1 score values, in order to segment the background, and to the PC2 score values, to remove the residues of the rind and edge pixels. As an example, the image of the ROI for an Out sample obtained after thresholding is shown in Fig. 6d.
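The image cleaning step described above amounts to building a pixel mask from the two score images. The sketch below assumes that the fat pixels are those with scores above the threshold on both PCs; since the sign of a principal component is arbitrary, the direction of each comparison is an assumption and would have to be checked on the score images. The arrays are small synthetic examples and `roi_mask` is a hypothetical name.

```python
import numpy as np

def roi_mask(pc1_scores, pc2_scores, threshold=0.0):
    """Boolean mask of the pixels kept as region of interest (fat tissue)."""
    not_background = pc1_scores > threshold   # PC1 separates the background
    not_rind_or_edge = pc2_scores > threshold # PC2 separates rind and edge pixels
    return not_background & not_rind_or_edge

# Synthetic 2 x 3 score images and a matching hypercube with 4 spectral bands.
pc1 = np.array([[-2.0, 1.0, 1.5], [-1.0, 0.5, 2.0]])
pc2 = np.array([[0.3, -0.4, 0.2], [0.1, 0.6, -0.1]])
cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)

mask = roi_mask(pc1, pc2)
roi_spectra = cube[mask]                      # one row per retained pixel
```

Indexing the hypercube with the mask yields the matrix of ROI spectra, one row per pixel, on which a pixel-level classification model can then be calculated.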
3.3.2. PLS-DA classification and variable selection

The PLS-DA models were calculated on all the spectra extracted from the ROIs of the TRN image, i.e. on 9157 spectra of the class Out and on 11,156 spectra of the class In, in order to account for all the variability of the two classes. Several classification models were calculated using different pretreatments, and the minimum value of RMSECV was obtained by applying det1 + m (as for PCA) and using 5 LVs. For both the TST1 and TST2 images, the effectiveness of the model was evaluated by means of the classification efficiency values, which indicate the percentage of pixels (and not of samples, as for the FT-NIR data) correctly classified. Moreover, by taking advantage of the spatial information available, the classification model was also evaluated qualitatively by looking at the pseudo-color image of the predicted class probability values and by comparing these with the known origin of the test samples. The best classification model showed EFF values higher than 99% in cross-validation on the TRN image and higher than 95% in prediction on the TST1 image (Table 6), confirming its ability to distinguish between the Out and In classes. Moreover, the hypothesis that the lower face of the Out samples could have a different fatty acid composition compared with the upper one was supported by the lower efficiency obtained for the prediction on the TST2 image, as well as by the evaluation of the predicted probability image (Fig. 7b), where it can be noticed that all the misclassified pixels belong to the Out_low samples.

The forward iPLS-DA algorithm for variable selection, applied using 21 intervals composed of 7 variables each, led to the selection of 28 variables out of the 149 original ones, corresponding to the two regions highlighted in gray in Fig. 8b, shown in comparison with the VIP scores obtained for the PLS-DA model in Fig. 8a.
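Since the HSI variables are acquired on a wavelength axis (nm) while the FT-NIR results are expressed in wavenumbers (cm−1), the limits of the regions selected on the hyperspectral data can be converted with the standard relation wavenumber = 10^7 / wavelength. The short sketch below checks the values quoted in this section; the function names are illustrative.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm-1."""
    return 1.0e7 / wavelength_nm

def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm1

# Limits (nm) of the two regions selected by iPLS-DA on the hyperspectral data.
first_region = [round(nm_to_wavenumber(nm)) for nm in (1270, 1170)]
second_region = [round(nm_to_wavenumber(nm)) for nm in (1410, 1380)]
```

Note that the conversion reverses the order of the limits: the longer wavelength corresponds to the lower wavenumber.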
Fig. 6. PC1–PC2 score plot obtained from the analysis of an Out sample hyperspectral image (a). The corresponding PC1 and PC2 score images are reported in (b) and (c), respectively. The average HSI image of the same Out sample after thresholding on the first and second PCs is shown in (d).

It must be specified that the variable selection was made in the original spectral domain in which the HSI system works, i.e. wavelength (nm), while in the figure the spectral domain has been converted to wavenumber units (cm−1), in order to make the HSI results more easily comparable with the FT-NIR ones. The first region selected by iPLS-DA corresponds to the 7874–8547 cm−1 range (corresponding to 1270–1170 nm), and the second
one to the 7092–7246 cm−1 range (1410–1380 nm). Both spectral regions include the vibrations typical of the C–H bonds. Conversely, the VIP score plot of the PLS-DA model on the whole spectrum highlighted mainly the importance of the region centered at around 8250 cm−1, also attributed by Pérez-Juan et al. (2010) to the C–H bond (stretch second overtone). The new classification model calculated using the selected intervals had an optimal dimensionality of only 3 LVs, leading to EFF values in cross-validation still higher than 99%. The results obtained in prediction on the TST1 and TST2 images were also fully comparable to those obtained before (small improvements were observed in both cases), confirming the relevance of the selected variables. The predicted probability image (very similar to the one reported in Fig. 7) shows that one of the Out samples in TST2 is almost completely misclassified, except for a sharp triangular portion; this is probably due to errors made during the manual cut performed to separate the two layers.

Table 6 Results of PLS-DA models obtained on hyperspectral images calculated using all the original variables and only the variables selected by iPLS-DA.

| | TRN image (CV) | TST1 image | TST2 image |
|---|---|---|---|
| All variables: # of LVs | 5 | 5 | 5 |
| All variables: EFF | 99.8 | 95.9 | 84.8 |
| iPLS-DA selected variables: # of LVs | 3 | 3 | 3 |
| iPLS-DA selected variables: EFF | 99.4 | 96.3 | 85.4 |

The test set results (TST1 and TST2 images) show the model performance in prediction.

4. Conclusions

In this work a comparison among the chemical information brought by three different techniques, i.e. tristimulus colorimetry, FT-NIR spectroscopy and NIR hyperspectral imaging, is presented with the aim of verifying the suitability of these techniques to rapidly discriminate fat samples coming from different subcutaneous layers. In fact, in the Italian industry the different fat layers are generally destined to the manufacturing of different end products; as a consequence, the availability of cheap, rapid and affordable methods for the characterization of the overall fat quality is desirable. The results achieved by tristimulus colorimetry showed that a relationship does exist between the colorimetric parameters and the characteristics of fat which render it suitable for specific end-uses, even if the classification did not reach excellent results; conversely, the NIR-based spectroscopic methods gave much more satisfactory results. In particular, the results obtained on FT-NIR data showed that the use of the integrating sphere as a sampling tool gave better classification models than the fiber optic probe, probably due to the better acquisition performance in terms of spectra reproducibility and to the wider area of the analyzed sample surface. In addition, some variable selection methods were applied to the NIR and HSI data in order to refine the classification results: the model performance did not generally improve significantly; however, the qualitative interpretation of the selected spectral regions led to the identification of bands ascribable to the vibrations of chemical bonds typical of a lipid matrix. A further investigation on the samples that were more frequently misclassified in the colorimetric and FT-NIR models showed a very high degree of misclassification for the samples of class Out compared with the samples of class In, independently of the analytical technique and of the classification method used to analyze the samples and to process the data.
This observation confirms the presence of a real separation surface between the different fat layers, which is reflected in their different physico-chemical characteristics; in fact, the samples corresponding to the outer disks seem to have a greater variability in fatty acid composition and water amount compared to the inner disks.
A particular remark is finally due to the more than satisfactory results of the hyperspectral imaging analysis. The classification efficiency on the test images and the related predicted probability images showed a good capability of HSI to discriminate between samples from different fat layers. These results encourage a possible utilization of imaging methods, after a proper engineering of the data elaboration system to enhance the processing speed, thus allowing quality control to be performed in real time.

Fig. 7. Predicted probability obtained by PLS-DA on the merged hyperspectral images of the test sets. (a) TST1: the three samples at the top belong to class Out, the others to class In; (b) TST2: the six samples at the top belong to class Out, the others to class In.

Fig. 8. VIP scores for the PLS-DA classification model (a) and selected intervals for the iPLS-DA model (b) obtained on the hyperspectral images dataset, shown together with the average spectra of the two classes Out (black) and In (gray).

Acknowledgments

The research was supported by Fondazione Cassa di Risparmio Pietro Manodori (Reggio Emilia).

References
Andersen, C. M., & Bro, R. (2010). Variable selection in regression — A tutorial. Journal of Chemometrics, 24, 728–737. Antonelli, A., Cocchi, M., Fava, P., Foca, G., Franchini, G. C., Manzini, D., et al. (2004). Automated evaluation of food colour by means of multivariate image analysis coupled to a wavelet-based classification algorithm. Analytica Chimica Acta, 515, 3–13. Burger, J., & Geladi, P. (2005). Hyperspectral NIR image regression part I: Calibration and correction. Journal of Chemometrics, 19, 355–363. Burger, J., & Geladi, P. (2006). Hyperspectral NIR image regression part II: Dataset preprocessing diagnostics. Journal of Chemometrics, 20, 106–119. Carrapiso, A. I., & Garcia, C. (2005). Instrumental colour of Iberian ham subcutaneous fat and lean (biceps femoris): Influence of crossbreeding and rearing system. Meat Science, 71, 284–290. Cocchi, M., Corbellini, M., Foca, G., Lucisano, M., Pagani, M. A., Tassi, L., et al. (2005). Classification of bread wheat flours in different quality categories by a wavelet-based feature selection/classification algorithm on NIR spectra. Analytica Chimica Acta, 544, 100–107. Cocchi, M., Foca, G., Lucisano, M., Marchetti, A., Pagani, M. A., Tassi, L., et al. (2004). Classification of cereal flours by chemometric analysis of MIR spectra. Journal of Agricultural and Food Chemistry, 52, 1062–1067. Cocchi, M., Seeber, R., & Ulrici, A. (2001). WPTER: Wavelet packet transform for efficient pattern recognition of signals. Chemometrics and Intelligent Laboratory Systems, 57, 97–119. Cozzolino, D., & Murray, I. (2004). Identification of animal meat muscles by visible and near infrared reflectance spectroscopy. LWT-Food Science and Technology, 37(4), 447–452. Enser, M. B. (1983). The relationship between the composition and the consistency of pig backfat. Fat quality in lean pigs, workshop in the EEC programme, Brussels, September 20–21, 1983 (pp. 53–57). Ferrari, E., Foca, G., Vignali, M., Tassi, L., & Ulrici, A. (2011). 
Adulteration of the anthocyanin content of red wines: perspectives for authentication by NIR and 1H NMR spectroscopies. Analytica Chimica Acta, 701(2), 139–151. Foca, G., Cocchi, M., Li Vigni, M., Caramanico, R., Corbellini, M., & Ulrici, A. (2009). Different feature selection strategies in the wavelet domain applied to NIR-based quality classification models of bread wheat flours. Chemometrics and Intelligent Laboratory Systems, 99, 91–100. Forina, M., Oliveri, P., Jäger, H., Römisch, U., & Smeyers-Verbeke, J. (2009). Class modeling techniques in the control of the geographical origin of wines. Chemometrics and Intelligent Laboratory Systems, 99, 127–137. Gandemer, G. (2002). Lipids in muscles and adipose tissues, changes during processing and sensory properties of meat products. Meat Science, 62, 309–321. Gjerlaug-Enger, E., Kongsro, J., Aass, L., Ødegard, J., & Vangen, O. (2011). Prediction of fat quality in pig carcasses by near-infrared spectroscopy. Animal, 5(11), 1829–1841.
Gonzalez-Martin, I., Gonzalez-Perez, C., Alvarez-Garcia, N., & Gonzalez-Cabrera, J. M. (2005). On-line determination of fatty acids composition in intramuscular fat of Iberian pork loin by NIRs with a remote reflectance fibre optic probe. Meat Science, 69, 243–248. Gonzalez-Martin, I., Gonzalez-Perez, C., Hernandez-Mendez, J., & Alvarez-Garcia, N. (2003). Determination of fatty acids in the subcutaneous fat of Iberian breed swine by near infrared spectroscopy (NIRS) with a fibre-optic probe. Meat Science, 65, 713–719. Gosselin, R., Rodrigue, D., & Duchesne, C. (2010). A bootstrap-VIP approach for selecting wavelength intervals in spectral imaging applications. Chemometrics and Intelligent Laboratory Systems, 100, 12–21. Gowen, A. A., O'Donnell, C. P., Cullen, P. J., Downey, G., & Frias, J. M. (2007). Hyperspectral imaging – An emerging process analytical tool for food quality and safety control. Trends in Food Science & Technology, 18, 590–598. Hallenstvedt, E., Kjos, N. P., Øverland, M., & Thomassen, M. (2012). Changes in texture, colour and fatty acid composition of male and female pig shoulder fat due to different dietary fat sources. Meat Science, 90, 519–527. Kamruzzaman, M., ElMasry, G., Sun, D. -W., & Allen, P. (2011). Application of NIR hyperspectral imaging for discrimination of lamb muscles. Journal of Food Engineering, 104(3), 332–340. Kobayashi, K. I., Matsui, Y., Maebuchi, Y., Toyota, T., & Nakauchi, S. (2010). Near infrared spectroscopy and hyperspectral imaging for prediction and visualisation of fat and fatty acid content in intact raw beef cuts. Journal of Near Infrared Spectroscopy, 18(5), 301–315. Leardi, R., & Norgaard, L. (2004). Sequential application of backward interval PLS and genetic algorithms for the selection of relevant spectral regions. Journal of Chemometrics, 18(11), 486–497. Lebret, B., & Mourot, J. (1988). Caractéristiques et qualité des tissus adipeux chez le porc.
Facteurs de variation non génétiques. INRA Productions Animales, 11, 131–143. Lo Fiego, D. P. (1996). Carcass fatness and lipid quality in the heavy pig. Meat Focus International, 5(8), 261–263. Lo Fiego, D. P., Tedeschi, M., Santoro, P., & Nanni Costa, L. (1987). Ricerche sul contenuto di idrossiprolina nel tessuto adiposo del suino pesante. Atti della Società Italiana delle Scienze Veterinarie, XLI, 728–730 (Part II). Malmfors, B., Lundstrom, K., & Hansson, I. (1978). Fatty acid composition of porcine backfat and muscle lipids as affected by sex, weight and anatomical location. Swedish Journal of Agricultural Research, 8, 25–38. Mersmann, H. J., & Leymaster, K. A. (1984). Differential deposition and utilization of backfat layers in swine. Growth, 48, 321–330. Müller, M., & Scheeder, M. R. L. (2008). Determination of fatty acid composition and consistency of raw pig fat with near infrared spectroscopy. Journal of Near Infrared Spectroscopy, 16(3), 305–309.
O'Farrell, M., Wold, J. P., Høy, M., Tschudi, J., & Schulerud, H. (2010). On-line fat content classification of inhomogeneous pork trimmings using multispectral near infrared interactance imaging. Journal of Near Infrared Spectroscopy, 18(2), 135–146. Pérez-Juan, M., Afseth, N. K., González, J., Díaz, I., Gispert, M., Font i Furnols, M., et al. (2010). Prediction of fatty acid composition using a NIRS fibre optics probe at two different locations of ham subcutaneous fat. Food Research International, 43, 1416–1422. Pérez-Marín, D., De Pedro Sanz, E., Guerrero-Ginel, J. E., & Garrido-Varo, A. (2009). A feasibility study on the use of near-infrared spectroscopy for prediction of the fatty acid profile in live Iberian pigs and carcasses. Meat Science, 83, 627–633. Pigani, L., Culetu, A., Ulrici, A., Foca, G., Vignali, M., & Seeber, R. (2011). Pedot modified electrodes in amperometric sensing for analysis of red wine samples. Food Chemistry, 129, 226–233. Prieto, N., Roehe, R., Lavin, P., Batten, G., & Andres, S. (2009). Application of near infrared reflectance spectroscopy to predict meat and meat products quality: A review. Meat Science, 83, 175–186. Rinnan, A., Van den Berg, F., & Engelsen, S. B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. Trends in Analytical Chemistry, 28, 1201–1222. Ripoche, A., & Guillard, A. S. (2001). Determination of fatty acid composition of pork fat by Fourier transform infrared spectroscopy. Meat Science, 58, 299–304. Santoro, P. (1983). Fat quality in pig meat with special emphasis on cured and seasoned raw hams. Fat quality in lean pigs, workshop in the EEC programme, Brussels, September 20–21, 1983 (pp. 43–46). Ulrici, A., Cocchi, M., Durante, C., Foca, G., Marchetti, A., & Tassi, L. (2008). Multivariate analysis of analytical signals to decipher relevant chemical information. In L. Tassi, & M. P. Colombini (Eds.), New trends in analytical, environmental and cultural heritage chemistry (Chpt. 
5). Trivandrum: Research Signpost. Van Oeckel, M. J., Warnants, N., & Boucque, Ch. V. (1999). Measurement and prediction of pork colour. Meat Science, 52, 347–354. Walczak, B. (2000). Wavelets in chemistry (1st ed.). Amsterdam: Elsevier. Williams, P. J., Geladi, P., Britz, T. J., & Manley, M. (2012). Near-infrared (NIR) hyperspectral imaging and multivariate image analysis to study growth characteristics and differences between species and strains of members of the genus Fusarium. Analytical and Bioanalytical Chemistry, 404, 1759–1769. Wood, J. D., Enser, M., Fisher, A. V., Nute, G. R., Sheard, P. R., Richardson, R. I., et al. (2008). Fat deposition, fatty acid composition and meat quality: A review. Meat Science, 78, 343–358. Wood, J. D., Richardson, R. I., Nute, G. R., Fisher, A. V., Campo, M. M., Kasapidou, E., et al. (2003). Effects of fatty acids on meat quality: A review. Meat Science, 66, 21–32.