Postharvest Biology and Technology 115 (2016) 81–90
Contents lists available at ScienceDirect
Postharvest Biology and Technology journal homepage: www.elsevier.com/locate/postharvbio
Color compensation and comparison of shortwave near infrared and long wave near infrared spectroscopy for determination of soluble solids content of ‘Fuji’ apple
Zhiming Guo a,b,*, Wenqian Huang b, Yankun Peng c, Quansheng Chen a,*, Qin Ouyang a, Jiewen Zhao a
a School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
b National Engineering Research Center of Intelligent Equipment for Agriculture, Beijing 100097, China
c College of Engineering, China Agricultural University, Beijing 100083, China
A R T I C L E I N F O
A B S T R A C T
Article history: Received 22 June 2015 Received in revised form 10 December 2015 Accepted 22 December 2015 Available online 29 December 2015
Shortwave near infrared (SWNIR) and long wave near infrared (LWNIR) spectroscopy with a novel color compensation method were compared for predicting the soluble solids content of apple. Linear and nonlinear regression models were considered. Independent component analysis-support vector machine (ICA-SVM) models proved to be superior to the other nonlinear models. Rp was 0.9398 and RMSEP was 0.3870% for the optimal SWNIR model, while Rp was 0.9455 and RMSEP was 0.3691% for the optimal LWNIR model. Moreover, the results showed that color compensation could significantly improve the prediction performance of the SWNIR model. Our work implies that SWNIR with color compensation has clear potential in practical industrial use for real-time monitoring of apple quality. © 2015 Elsevier B.V. All rights reserved.
Keywords: Apple Near infrared spectroscopy Color compensation Soluble solids content Independent Component analysis Support vector machine
1. Introduction
Apple is one of the most important fruits in international trade. Current annual worldwide apple production is estimated at over 80 million tons (FAO-STAT 2013). Over the years, consumers' attention to fruit quality has extended beyond external attributes such as color, size and shape (Zhang et al., 2014) to internal attributes such as the soluble solids content (SSC) and the nutritional content (Mendoza et al., 2014; Kumar et al., 2015). SSC is the most important internal quality index of apple, and also relates to fruit maturity and harvest time (Drogoudi and Pantelidis, 2011). Knowledge of the SSC may enhance the competitiveness and profitability of the industry and assure consumer acceptance and satisfaction. To face this challenge, a recent trend in agribusiness is a declining reliance on subjective assessments of quality and increasing adoption of objective and innovative techniques for sensing and measuring the
* Corresponding author at: School of Food and Biological Engineering, Jiangsu University. Fax: +86 511 88780201. E-mail address: [email protected] (Z. Guo).
http://dx.doi.org/10.1016/j.postharvbio.2015.12.027
0925-5214/© 2015 Elsevier B.V. All rights reserved.
internal quality attributes of fruit (Magwaza et al., 2012; Chen et al., 2013). Near-infrared (NIR) spectroscopy has been proposed as an alternative to traditional methods and has gained wide acceptance for rapid analysis of a wide variety of agricultural products and parameters, because it is a non-destructive, low-cost, accurate and reliable method that can be used to analyze multiple attributes simultaneously (Nicolaï et al., 2007; Burns and Ciurczak, 2007; Siesler et al., 2008; Alfatni et al., 2013). Besides, a fully calibrated and validated NIR model can be easily handled without special skills in spectroscopy. NIR spectroscopy is concerned with both electronic transitions and vibrational transitions (Ozaki, 2012). Thus far, most applications of NIR spectroscopy are based on vibrational spectroscopy. NIR spectroscopy is based on the principle that different chemical bonds in organic matter absorb or emit light of different wavelengths when the sample is irradiated (Chen et al., 2015). It is ideally suited to the requirements of the agrofood industry in terms of both quality control and traceability (Huck-Pezzei et al., 2014). Two spectrometers were used in this research: one for the shortwave near infrared (SWNIR) range and one for the long wave near infrared (LWNIR) range. SWNIR spectroscopy (700–1100 nm) is mainly based on 3rd or 4th overtones of vibrations; thus, NIR energy in the SWNIR region has a stronger
penetration and a lower heating effect than that in the LWNIR region (1100–2500 nm). LWNIR spectroscopy addresses the 1st or 2nd overtones and combination vibrations, with a stronger absorptivity. Most of the related literature has focused on assessing internal quality attributes such as SSC, available acidity, firmness and dry matter in intact apples (Butz et al., 2005; Nicolaï et al., 2014). Bochereau et al. (1992) applied near infrared spectra combined with multilayer neural networks to estimate the SSC in apple. Many investigators have evaluated the fundamental relationship between the internal quality attributes and the response of NIR spectra (Lammertyn et al., 1998; Lu et al., 2000; Peirs et al., 2001; Xiaobo et al., 2007). A relationship was also found between different Vis/NIR wavelengths and sensory attributes for apples (Mehinagic et al., 2003; Bobelyn et al., 2010; Eisenstecken et al., 2015). The potential of NIR for determination of sweetness, sourness and their ratio for five cultivars of apple was investigated by Jha and Garg (2010). Visible and SWNIR spectroscopy and spectral scattering techniques were compared to sort apples into two quality groups by firmness, SSC, or their combination (Mendoza et al., 2014). Spectroscopic measurements in the visible and NIR wavelength range have achieved success in non-destructive assessment of apple quality attributes contributed by physicochemical and nutraceutical characteristics inside the fruit. Nevertheless, few complete commercial solutions are available that meet all the requirements for on-line and on-site measurement of fruit quality attributes, for which accuracy and robustness are the most important technical aspects. The spectrum used for analysis is sensitive to variations in temperature, color and spectrophotometer state.
Some researchers have studied temperature compensation when working with NIR data (Jiang et al., 2008; Qin and Lu, 2008). External parameter orthogonalisation has been proposed as a preprocessing method to remove the influence of fruit temperature on the sugar content measurement of intact apples (Roger et al., 2003). Two low-cost approaches based on simulation and prior knowledge about temperature perturbations were proposed by Segtnan et al. (2005) for handling irrelevant spectral variation. Temperature compensation was established for watermelon juice samples (Yao et al., 2013). The mechanism of the effect of temperature on NIR spectra is related to the thermal properties of hydrogen bonding. Much information is available about fruit quality grading based on color, but little has been published about the influence of peel color on quality determination by NIR. Differences in fruit color were not related to differences in fruit quality parameters (Iglesias et al., 2012; Merzlyak et al., 2003). In apple, the main pigments responsible for peel color are chlorophylls, carotenoids and anthocyanins (Delgado-Pelayo et al., 2014). The spatially resolved steady-state diffuse reflectance technique has been used for measurement of the optical properties of fresh fruit and vegetables. Spectra of the absorption coefficient were characterized by the major pigments (chlorophyll, anthocyanin, and carotenoid), whereas spectra of the reduced scattering coefficient generally decreased with increasing wavelength (Qin and Lu, 2008). Obstacles including overlapping light absorption by pigments in the peel and the non-linear relationship between the spectral response and pigment content complicate non-destructive assessment of SSC in apple fruit. The precision of SSC determination by NIR measurement of whole fruit is limited by the contribution of peel pigments to the overall light absorption, which is difficult to estimate (Merzlyak et al., 2003).
Nevertheless, the peel pigments can be described as a spectral perturbation caused by an absorbance shift. The objectives of this study were (1) to determine the effect of fruit peel color on measurements of the SSC in apple by NIR and eliminate it by means of color compensation methods; (2) to compare the performance of SWNIR (500–1000 nm) and LWNIR (1000–2500 nm) spectroscopy in calibration models obtained with different instruments, different wavelength ranges, and different multivariate regression models; and (3) to establish relationships between NIR spectra and the surface color relating to the internal quality of ‘Fuji’ apple by the color compensation model.

2. Materials and methods
2.1. Preparation of apple fruit samples
‘Fuji’ apple (Malus × domestica Borkh. cv. ‘Red Fuji’) has recently become popular due to its distinctive quality and high economic return, and was considered in the present work. Sample collection was conducted at Changping Fuji Apple Research and Demonstration Station, Beijing, China. In order to provide a high range of variability, apple samples were collected from three orchards with different cultivars and management models. Those orchards were located at about E116°04′17″ and N10°46′71″, had an elevation of 113 m above sea level, an annual temperature ranging between 11.5 °C and 12.0 °C, and an annual rainfall of 550 mm. The ‘Fuji’ apple samples were collected in mid-October 2014. A total of 160 ‘Fuji’ apples free from any abnormal features such as defects, bruises, diseases and contamination were selected, and the equatorial diameter range of the apples was 75–80 mm. As soon as the samples were received in the laboratory, the apples were stored at 4 °C and 90% relative humidity until the time of analysis. Before acquiring the spectra, samples were taken out of cold storage and placed under controlled conditions (20 °C, 60% relative humidity) for 12 h to equilibrate. After the nondestructive measurements using two different spectrometers, the SSC of the test apples was measured, using standard destructive methods as described further, from the same location where spectra were acquired. Then, all 160 samples were randomly divided into two subsets.
The first subset, called the calibration set, contained 106 samples and was used for building the model, while the other, called the prediction set, contained 54 samples and was used for testing the robustness of the model.

2.2. Acquisition of spectra
The samples were analyzed by spectroscopy in the SWNIR region (500–1100 nm) and the LWNIR region (10000–4000 cm⁻¹). Spectra were collected for all samples in diffuse reflectance mode and expressed as log(1/R). Two laboratory spectrometers were used in this research. LWNIR spectra were obtained using an Antaris™ II method development sampling (MDS) system (Thermo Fisher Scientific Inc., Madison, WI, USA) spectrometer interfaced to a personal computer using the software Result Integration 3.0. Spectra were recorded between 10,000 and 4000 cm⁻¹ at 4 cm⁻¹ resolution by co-adding 32 scans using an integrating sphere and an empty cell as a reference. LWNIR measurements were performed with a high-sensitivity InGaAs detector with a tungsten lamp as the NIR source. SWNIR spectra of each sample were measured with an Ocean Optics model USB2000+ fiber spectrometer (Ocean Optics Inc., Dunedin, FL, USA), with a 2048-element linear silicon detector array, about 0.3 nm sampling interval, and 0.27–0.38 nm optical resolution (FWHM) in the 500–1100 nm range. The fiber spectrometer was connected to a computer and controlled by SpectraSuite software from the same company. Spectra were collected with an integration time of 500 ms and averaged over 3 scans per measurement. Spectra were collected with the two spectrometers at the same equatorial position to facilitate comparison.
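The two instruments report their spectral axes in different units, and both express the signal as log(1/R). The short Python sketch below illustrates the two conversions involved, under the assumption that R is a relative reflectance between 0 and 1; the function names are hypothetical and are not taken from the instrument software.

```python
import math


def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm (lambda = 1e7 / nu)."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 1.0e7 / wavenumber_cm1


def absorbance_from_reflectance(reflectance):
    """Return log10(1/R) for a relative reflectance R in (0, 1] (assumed scale)."""
    if not 0.0 < reflectance <= 1.0:
        raise ValueError("reflectance must be in (0, 1]")
    return math.log10(1.0 / reflectance)


# Limits of the LWNIR acquisition range quoted in the text, converted to nm.
lwnir_limits_nm = [wavenumber_to_wavelength_nm(v) for v in (10000.0, 4000.0)]

# Hypothetical reflectance value, used only to illustrate the log(1/R) transform.
example_absorbance = absorbance_from_reflectance(0.1)
```

With these definitions, 10000–4000 cm⁻¹ corresponds to 1000–2500 nm, which matches the LWNIR range stated in the objectives.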
2.3. Color and sugar content measurement
The International Commission on Illumination (CIE) Lab color space most closely represents human color perception. In this paper, apple skin color was measured with a HP-200 portable tristimulus colorimeter (Puxi Corp, Shanghai, China) and recorded in CIE color space coordinates (L*, a* and b*). The colorimeter was standardized by using rectification plates before measurement. The inspection angle was 10°. The apple skin color was measured at the equatorial position corresponding to that where spectra were acquired. In order to assess the real quality parameters of the fruit, the SSC was determined following the spectral measurement using traditional destructive tests as a reference. Apple juice was squeezed using a manual fruit squeezer from the same position as the spectra acquisition. The SSC of the juice was recorded with a digital refractometer with temperature correction (Model Arias 500, Reichert Inc., NY, USA) and expressed in % at 20 °C. This dual-array, automatic refractometer eliminates the need for shadow-line intersect interpretation and delivers consistently repeatable results.

2.4. Multivariate calibration
The quality of the calibration model was evaluated based on the root mean square error of calibration (RMSEC) and the correlation coefficient (Rc) in the calibration set, and tested using the root mean square error of prediction (RMSEP) and the correlation coefficient (Rp) in the prediction set. Low values of RMSEP indicate a good prediction performance of the model. All computations were performed using Matlab software (Mathworks Inc., USA) under Windows 7. Fig. 1 shows the main procedures for color compensation and comparison of calibration models for SSC in apple using NIR.

2.4.1. Synergy interval partial least squares-successive projections algorithm
Variable selection can be regarded as a combinatorial optimization problem involving the minimization of a cost function related to the quantitative analysis of NIR spectra. It is important to select specific information-rich regions, which yields more stable models with superior interpretability. A graphically oriented local multivariate calibration modeling procedure called synergy interval partial least squares (siPLS) was applied to select the combination of spectral regions that gives the best statistics (Chen et al., 2012). The siPLS algorithm develops PLS regression models for all possible
combinations of different intervals. The combination of intervals with the lowest RMSEC is chosen. The successive projections algorithm (SPA) has been proposed as a forward selection approach for multivariate calibration to minimize variable multicollinearity. In SPA, the selection of variables is cast in the form of a combinatorial optimization problem with constraints. The goal of SPA is to find a small representative set of spectral variables with an emphasis on the minimization of collinearity (Soares et al., 2013). In the present work, siPLS and SPA are applied sequentially to search for optimized spectral intervals and an optimized combination of variables from the informative regions in model calibration (Wu et al., 2010). Here, siPLS is first used to select several efficient spectral subintervals, and then SPA is employed to choose fewer effective variables from the spectral regions selected by siPLS. The final informative variables were those selected by the siPLS-SPA model that gave the best performance with respect to RMSEC.

2.4.2. Independent component analysis
Independent component analysis (ICA) is a recently developed statistical approach for blind source separation: it separates independent source variables from observed variables that are combinations or mixtures of these sources. Because the NIR spectrum of a mixture can be considered, to a first approximation, as the linear addition of the individual spectra of the constituents in the mixture, such a spectrum can be regarded as an assembly of ‘blind sources’, as the proportion of constituents in the samples remains unknown (Hyvärinen and Oja, 1997). Since NIR spectra are the combined result of spectra of individual components, it would be useful if the component spectra could be separated.
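The linear-mixing assumption described above can be illustrated with a small forward model: observed spectra are weighted sums of a few source spectra. The sketch below is not the ICA estimation itself (which recovers the sources and weights from the observations); it only shows the mixing step. The matrix names A, M and I follow the notation of this section, and all numerical values are hypothetical.

```python
def mix_spectra(mixing, sources):
    """Return observed spectra A = M * I using plain nested lists.

    mixing  : one row of weights per observed sample (M)
    sources : one row of channel values per source spectrum (I)
    """
    n_components = len(sources)
    n_channels = len(sources[0])
    observed = []
    for weights in mixing:
        if len(weights) != n_components:
            raise ValueError("each mixing row needs one weight per source")
        row = [
            sum(w * src[k] for w, src in zip(weights, sources))
            for k in range(n_channels)
        ]
        observed.append(row)
    return observed


# Two hypothetical source spectra sampled at four channels.
sources = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 3.0],
]
# Hypothetical mixing weights for three observed samples.
mixing = [
    [0.5, 0.5],
    [2.0, 1.0],
    [0.0, 4.0],
]
observed = mix_spectra(mixing, sources)
```

In practice only `observed` is measured; an ICA algorithm estimates both the weights and the sources from it.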
Especially in circumstances where the spectra of the chosen samples are themselves mixtures of unknown pure constituent spectra, the separation of these unknown spectra becomes very important in identifying the unknown pure components (Chen and Wang, 2001), and this work shows that ICA can be used to obtain statistically independent latent variables (LVs) for multivariate regression. The extraction of the LVs as independent components (ICs) is performed by a fast and reliable algorithm for ICA called fast ICA (http://www.cis.hut.fi/projects/ica/fastica/). This algorithm was chosen mainly because of its robustness and computational speed and because it is available online at http://research.ics.aalto.fi/ica/fastica/. The ICs are latent variables; therefore, they cannot be directly observed. This implies that the mixing matrix, which gives the intensity of the sources among the observed signals, is also unknown. The purpose of the ICA algorithm is to determine the mixing matrix (M) or its inverse, the separating
Fig. 1. Schematic diagram of the experimental procedure.
Table 1
Descriptive statistics of SSC (%) and color parameters measured by standard methods for apple samples.

Subset           Number of samples  Index  Range        Mean   Standard deviation
Calibration set  106                SSC    10.81–16.50  13.41  1.248
                                    L*     56.96–89.04  75.42  7.423
                                    a*     4.77–48.89   17.89  13.08
                                    b*     3.85–32.85   19.93  6.384
Prediction set   54                 SSC    11.31–16.11  13.45  1.143
                                    L*     60.97–87.65  74.7   7.322
                                    a*     2.93–40.08   18.78  12.16
                                    b*     8.54–35.58   19.86  6.796
matrix. The equation can be expressed as

$$A_{l \times n} = M_{l \times m} I_{m \times n} \tag{1}$$

where $A_{l \times n}$ represents the observed spectra of the objects and $I_{m \times n}$ is the estimation of the independent components. According to the Beer-Lambert law, and considering the concentration matrix $C$ and the corresponding mixing matrix $M_c$, the relationship can be described as

$$C = M_c B \tag{2}$$
where B is the matrix of regression coefficients. In this paper, ICA is compared with principal component analysis (PCA) for input data reduction to test whether the predictions can be improved.

2.4.3. Least squares support vector regression
Support vector machines (SVM) were proposed by Vapnik to solve classification problems (Vapnik, 1998). Recently, SVM has been shown to be a candidate for spectral regression and has been applied to chemometric problems for nonlinear quantitative prediction. The basic idea of SVM is first to provide a nonlinear function approximation by mapping the input vectors into a high-dimensional feature space where a special type of hyperplane is constructed, and then to build a regression model in the hyperplane. The least squares support vector regression (LS-SVR) method is an efficient variant of SVM; the complex calculations of SVM are avoided in LS-SVR (Chauchard et al., 2004; Li et al., 2007; Naguib et al., 2014). In contrast to SVM, LS-SVR uses equality constraints for the errors instead of inequality constraints, together with a squared loss function. Determination of a proper kernel function and optimum kernel parameters is the crucial element in LS-SVR. The radial basis function (RBF) kernel is by far the most popular kernel choice because of its localized and finite response between the spectra and response values. In this study, the Gaussian RBF kernel was used. The LS-SVR method was used for building the calibration model of NIR spectra and compared with conventional partial least squares (PLS) regression models and artificial neural networks (ANN).

3. Results
3.1. Measurement data description
The soluble solids content values for the 160 apple samples ranged from 10.81 to 16.50%. The mean SSC was 13.42%, the median was 13.48% and the standard deviation was 1.21%. The distribution of the data was similar to a normal distribution, which is important in statistical analysis.
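As a consistency check, the overall mean and standard deviation quoted above can be reproduced by pooling the SSC statistics of the calibration and prediction sets in Table 1. The sketch below assumes that the tabulated standard deviations are sample standard deviations (n − 1 denominator), which the text does not state; the function name is hypothetical.

```python
import math


def pooled_mean_sd(groups):
    """Combine (n, mean, sd) summaries into the mean and SD of the pooled data.

    Assumes each sd is a sample standard deviation (n - 1 denominator).
    """
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # Within-group plus between-group sums of squares.
    sum_sq = sum(
        (n - 1) * sd ** 2 + n * (m - mean) ** 2 for n, m, sd in groups
    )
    return total_n, mean, math.sqrt(sum_sq / (total_n - 1))


# SSC rows of Table 1: (number of samples, mean, standard deviation).
table1_ssc = [(106, 13.41, 1.248), (54, 13.45, 1.143)]
n_total, ssc_mean, ssc_sd = pooled_mean_sd(table1_ssc)
```

Under this assumption the pooled values round to a mean of 13.42% and a standard deviation of 1.21% over 160 samples, in agreement with the figures in the text.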
A reasonable range of quality variation is necessary to establish a robust calibration model, since a narrow range could negatively affect the prediction accuracy of the quality attributes. Table 1 shows the statistical summary of SSC and color parameters. As seen from Table 1, the range of reference measurements of SSC in the calibration set almost covers the range in the prediction set. There is no significant difference between the standard deviations of the calibration and the prediction sets. Therefore, the distribution of the samples is appropriate in both the calibration and the prediction sets. Color space coordinates a* and b* have a high variability in apples. The a*/b* ratio and the a* values were directly related to the anthocyanin content (Iglesias et al., 2012). Consumer acceptability scores were not correlated with high anthocyanin levels or with a*/b* on either the exposed or the shaded side.

3.2. Spectral characteristics analysis
The diffuse reflectance spectra acquired from apple samples in the ranges 500–1000 nm (SWNIR) and 10000–4000 cm⁻¹ (LWNIR) are illustrated in the first row of Fig. 2. A similar general trend throughout the examined wavelength region at different SSC values was observed, but there were some differences in the magnitude of spectral intensity, in agreement with similar results described in the literature (Liu and Ying, 2005; Bureau et al., 2009; Magwaza et al., 2012). It can be concluded that the spectral peaks are inherent to the fruit constituents, while the differences in spectral intensity can be related to the component content of the fruit. The absorption has been reported to be influenced mainly by scattering, a physical phenomenon that is dependent on the density, cell structures, and cellular matrices of fruit tissue (Mendoza et al., 2014). There is an absorbance at 675 nm which may be related to the peel chlorophyll content of apples (Merzlyak et al., 2003). The negative peaks at 635 and 712 nm may be associated with pigmented compounds such as anthocyanins in the visible region (Janik et al., 2007). The low absorption values in the region between 700 and 830 nm probably do not contain important information related to apple quality (Martínez Vega et al., 2013).
In the SWNIR region, there was a significant absorption at about 760 nm, mainly related to the third overtone of O–H stretching. An absorption peak of water was observed close to 960 nm and may be associated with the second overtone of O–H stretching. The peak that appeared at about 8265 cm⁻¹ was attributed to the second overtone of C–H stretching. In addition, there are two obvious peaks in LWNIR. An absorption peak at about 6890 cm⁻¹ was related to the first overtone of O–H stretching (water absorption), and another peak between 5200 and 5100 cm⁻¹ was related to the O–H functional group (1st overtone of the combination mode) (Magwaza et al., 2012; Nicolaï et al., 2007). According to the above analysis, it could be concluded that there was no exclusive feature absorption band for SSC. Therefore, the implicit, hidden information should be mined and expressed to predict SSC indirectly. The Cochran test was used to detect outlying replicates, and no outlying replicates were detected (Centner et al., 1996). Baseline shifts and noise were observed in the spectra over broad wavelength regions, so spectral preprocessing techniques were used to remove irrelevant information that cannot be handled properly by the regression techniques. Several pretreatment methods for NIR spectra including derivatives (first and
[Figure 2 panels, shown for SWNIR (wavelength, nm) and LWNIR (wavenumber, cm⁻¹): siPLS spectral selection and siPLS-SPA variable selection (log(1/R) spectra); siPLS model, siPLS-SPA model and siPLS-SPA-Lab model (predicted value (%) versus reference value (%), calibration and prediction data).]
Fig. 2. Spectra feature selection and linear regression models for both SWNIR and LWNIR.
second), Savitzky–Golay smoothing, multiplicative scatter correction (MSC) and standard normal variate (SNV) were implemented and compared (Guo et al., 2011). After some trials and computation, no substantial improvement was obtained, so these pretreatment techniques were discarded. SWNIR and LWNIR raw absorbance spectra were used to build the calibration models in the further analysis.

3.3. Modeling based on linear regression
3.3.1. PLS models
In this study, the spectral ranges of 550–1050 nm for SWNIR and 14000–4000 cm⁻¹ for LWNIR were considered to construct regression models. Calibrations were performed so as to avoid overfitting the data and to obtain a reliable model. PLS regression was used to establish calibration models for the full spectrum. The optimum number of PLS factors decomposes the spectral data matrix into a structural component and noise. The optimum number of factors is determined by the lowest RMSEC. PLS regression models were developed for SSC prediction in apple. Results obtained using SWNIR and LWNIR spectroscopy are shown in Table 2. This study suggests that both SWNIR and LWNIR have great potential as a nondestructive tool for SSC detection in apple. However, the prediction accuracy of the models (Rc > 0.91 and Rp > 0.87) was fairly low and not suitable for further industrial applications. This might be because the classical PLS model was constructed using the full spectrum, which contains both useful and useless information. One reason for this is the incorporation of color information in the NIR spectra (Chia et al., 2012). The useless data inevitably reduce the general performance of the model.

3.3.2. SiPLS models
The full-spectrum region was split into 20 equidistant subintervals, and subintervals were selected by siPLS in order to evaluate the decrease in RMSEP obtained with a combination of four subintervals. Table 2 shows the statistical results for SSC determination using both SWNIR and LWNIR and siPLS calibration. For the SWNIR range, the optimal combination of selected intervals was [9, 13, 14, 16], corresponding to 764–790, 865–889, 899–912 and 937–959 nm.
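The search space of this interval selection can be sketched as follows: the spectrum is cut into 20 contiguous subintervals and every combination of four is a candidate. This is a minimal illustration of the enumeration only; fitting and scoring a PLS model for each candidate is omitted. The function names and the number of spectral variables (1000) are hypothetical and are not taken from the study.

```python
from itertools import combinations


def split_into_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs for contiguous, near-equal intervals."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def interval_combinations(n_intervals, n_selected):
    """Return every combination of 1-based interval numbers to be evaluated."""
    return combinations(range(1, n_intervals + 1), n_selected)


# Hypothetical spectrum length; the text only fixes 20 intervals, 4 selected.
intervals = split_into_intervals(n_variables=1000, n_intervals=20)
candidates = list(interval_combinations(20, 4))
n_candidates = len(candidates)
```

With 20 intervals and four selected at a time there are 4845 candidate combinations, so an exhaustive search of this size remains cheap; the reported optimum [9, 13, 14, 16] is one of these candidates.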
For the LWNIR range, the optimal combination was [5, 7, 8, 11], corresponding to 5922–6425, 6932–7436, 7438–7941 and 8953–9457 cm⁻¹. The optimal combinations of selected intervals are shown in Fig. 2.

Table 2
Performance of multivariable calibration models for SSC prediction of apples using different calibration methods for the SWNIR and LWNIR.

Spectral  Method      Rc      RMSEC (%)  Rp      RMSEP (%)
SWNIR     PLS         0.9160  0.4984     0.8731  0.5566
          siPLS       0.9241  0.4764     0.8900  0.5173
          siPLS-SPA   0.9226  0.4791     0.8792  0.5421
          siPLS-SPAf  0.9341  0.4434     0.9192  0.4470
          PCA-ANN     0.9339  0.4470     0.8742  0.5524
          PCA-ANNf    0.9412  0.4215     0.9194  0.4525
          PCA-SVM     0.9289  0.4674     0.9057  0.4800
          PCA-SVMf    0.9379  0.4350     0.9263  0.4325
          ICA-ANN     0.9309  0.4546     0.9114  0.4762
          ICA-ANNf    0.9482  0.3960     0.9315  0.4135
          ICA-SVM     0.9361  0.4422     0.9133  0.4706
          ICA-SVMf    0.9525  0.3802     0.9398  0.3870
LWNIR     PLS         0.9148  0.5038     0.8959  0.5044
          siPLS       0.9413  0.4197     0.9131  0.4683
          siPLS-SPA   0.9341  0.4436     0.9029  0.4704
          siPLS-SPAf  0.9406  0.4217     0.9161  0.4611
          PCA-ANN     0.9319  0.4588     0.9185  0.4556
          PCA-ANNf    0.9365  0.4374     0.9092  0.4724
          PCA-SVM     0.9398  0.4281     0.9110  0.4716
          PCA-SVMf    0.9435  0.4180     0.9272  0.4273
          ICA-ANN     0.9362  0.4370     0.9240  0.4341
          ICA-ANNf    0.9555  0.3668     0.9362  0.4001
          ICA-SVM     0.9499  0.3940     0.9347  0.4092
          ICA-SVMf    0.9604  0.3553     0.9455  0.3691
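The statistics reported in Table 2 can be computed from paired reference and predicted SSC values as sketched below. The sketch assumes the common definitions (root mean square error with the number of samples in the denominator, and the Pearson correlation coefficient); the source does not spell out its exact formulas, and the sample values are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


# Hypothetical SSC values (%) for five fruit, for illustration only.
reference = [11.0, 12.0, 13.0, 14.0, 15.0]
predicted = [11.2, 11.9, 13.1, 13.8, 15.3]

rmse_value = rmse(reference, predicted)
r_value = pearson_r(reference, predicted)
```

Applied to the calibration set these functions would give RMSEC and Rc; applied to the prediction set they would give RMSEP and Rp.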
These selected spectral regions are clearly correlated with the SSC in apple. The predictive performance of the siPLS models was better than that of the full-spectrum PLS models for both the validation and external prediction set. It is desirable to employ region selection algorithms for reducing the computational burden and improving the predictive ability of the calibration model, which is a critical step for further analysis.

3.3.3. SiPLS-SPA models
Based on the four subintervals selected by the siPLS model, the SPA algorithm was used to select effective variables from these regions. The nine selected wavelengths are shown in Fig. 2, and the Rc, RMSEC, Rp and RMSEP of the siPLS-SPA model based on these variables were 0.9226, 0.4791%, 0.8792 and 0.5421%, respectively, for the SWNIR range. Meanwhile, the selected wavenumbers were 5989, 6363, 5922, 7176, 7338, 9449 and 9409 cm⁻¹, and the Rc, RMSEC, Rp and RMSEP of the siPLS-SPA model based on these variables were 0.9341, 0.4436%, 0.9029 and 0.4704%, respectively, for the LWNIR range. The results show that SPA could find optimal values for several disparate variables associated with the calibration model; also, the siPLS procedure could be integrated into the objective function driving the optimization. The accuracy and robustness of the prediction models are similar to those of the siPLS models, while using far fewer variables. In order to further analyze the influence of peel color on the SSC calibration model, the color space coordinates were combined, as original input variables, with the variables selected by SPA to build a multivariate regression compensation model. Data fusion is a technique that seamlessly integrates information from disparate sources to produce a single model or decision. Fig. 2 shows the scatter plots indicating the correlation between reference measurements and NIR predictions in the calibration and prediction sets, respectively.
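The fusion step described above amounts to extending each sample's vector of selected spectral variables with its three color coordinates before regression. The sketch below shows one way to assemble such an input matrix; the function name and all values are hypothetical, and any scaling applied before regression is not stated in the text and is therefore left out.

```python
def fuse_features(spectral_rows, color_rows):
    """Append (L*, a*, b*) to each sample's selected spectral variables."""
    if len(spectral_rows) != len(color_rows):
        raise ValueError("need one color triple per spectrum")
    fused = []
    for spectrum, color in zip(spectral_rows, color_rows):
        if len(color) != 3:
            raise ValueError("color must be an (L*, a*, b*) triple")
        fused.append(list(spectrum) + list(color))
    return fused


# Hypothetical log(1/R) values at three selected wavelengths for two fruit.
selected_absorbance = [
    [0.82, 0.91, 1.05],
    [0.79, 0.88, 1.01],
]
# Hypothetical CIE L*, a*, b* readings for the same two fruit.
color_coordinates = [
    (75.4, 17.9, 19.9),
    (68.2, 30.1, 15.3),
]
fused_inputs = fuse_features(selected_absorbance, color_coordinates)
```

Each row of `fused_inputs` would then serve as the predictor vector of one sample in the compensation model.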
The results of the siPLS-SPA fusion models for both the SWNIR and LWNIR ranges are better than those of the siPLS-SPA models. Therefore, it is necessary to apply color compensation before SPA to enhance the performance of the model. The model accuracy can be compared by the F-test (Filgueiras et al., 2014), according to the equation $F_p = \mathrm{RMSEP}_1^2 / \mathrm{RMSEP}_2^2$, where $\mathrm{RMSEP}_1 > \mathrm{RMSEP}_2$. The calculated $F_p$ value is compared with the value of the Fisher-Snedecor distribution with degrees of freedom equal to the number of prediction samples and an adopted significance level of 10% (α = 0.10). If the tabulated value of the F-statistic is less than $F_p$, there is no statistical evidence of homogeneity of the values, and the method with $\mathrm{RMSEP}_2$ presents better accuracy. The F-test applied to the RMSEPs of these models showed that the errors are not similar at a 0.10 level of significance ($F_p$ = 1.47 > F-statistic = 1.44); for SWNIR, the RMSEP values in Table 2 give $F_p = 0.5421^2 / 0.4470^2 \approx 1.47$. This finding indicates that the siPLS-SPA variable selection method fused with color parameters in the SWNIR spectral range produces more accurate results.

3.4. Modeling based on nonlinear regression
3.4.1. PCA-ANN models
Principal component analysis (PCA) creates components to explain the observed variability in the predictor variables, without considering the response variable at all. In these PCA-ANN models, the spectral data were first analyzed by PCA to reduce repetition and redundancy in the input data. Then, the scores of the principal components (PCs) were chosen as input nodes for the input layer instead of the spectral data. In addition, noise and random error in the original spectral data would be excluded by the use of these scores. In order to achieve the optimal performance that balances under-fitting and over-fitting, the number of PCs was determined to be 5 for SWNIR. Based on the
[Fig. 3. Predicted versus reference values for nonlinear regression models for both SWNIR and LWNIR. PCA-based panels: ANN, ANN-Lab, LS-SVM and LS-SVM-Lab, each for SWNIR and LWNIR; x-axis: reference value (%), y-axis: predicted value (%); calibration and prediction data plotted separately.]
results, the first five PCs for SWNIR accounted for 99.05% of the variance. Similarly, PCA of the LWNIR spectra gave a cumulative variance contribution rate of 99.01% based on the first four PCs. The number of hidden neurons and the transfer functions play an important role in the proper training of ANN models. The number of hidden neurons is related to the convergence of the output error function during the learning process; in this work, it was optimized to 4. The tan-sigmoid transfer function followed by the purelin (linear) transfer function is commonly used for nonlinear systems. Fig. 3 shows the results of nonlinear regression models for SWNIR and LWNIR. For SWNIR, using PC scores in a 5:4:1 architecture
[Fig. 3. (Continued) ICA-based panels: ANN, ANN-Lab, LS-SVM and LS-SVM-Lab; x-axis: reference value (%), y-axis: predicted value (%); calibration and prediction data plotted separately.]
with four nodes in the inner layer resulted in Rc = 0.9339, RMSEC = 0.4470%, Rp = 0.8742 and RMSEP = 0.5542%. For LWNIR, a 4:4:1 architecture with four nodes in the inner layer, applied to the PC scores, gave results similar to those for SWNIR. Color space coordinates were also used as input nodes, combined with the PCs, to build the fusion PCA-ANN models. The developed fusion PCA-ANN models were tested for validity and predictive ability with an independent prediction set. Color compensation had a positive effect on SWNIR model performance and a negative effect on LWNIR model performance.
3.4.2. PCA-SVM models
As with the PCA-ANN calibration model, the LS-SVM regression used PC scores as input vectors. Before developing the LS-SVM model, the regularization parameter γ and the RBF kernel parameter σ² needed to be optimized. The optimum parameters of LS-SVM are very important because they determine the learning ability, prediction ability and generalization ability of the calibration model (Ji-yong et al., 2013). γ determines the trade-off between the training error and model simplicity. σ² is the bandwidth and implicitly defines the nonlinear mapping from the input space to a high-dimensional feature space (Chauchard et al., 2004). Selecting the optimum parameters for LS-SVM is difficult: sometimes the combination of parameters that gives the lowest RMSECV gives a higher RMSEP, or vice versa (i.e., underfitting). The optimum values for γ and σ² were obtained by running a two-step grid search based on cross validation to give the RMSECV. Grid search is a two-dimensional minimization procedure based on exhaustive search in a limited range. The first step of the grid search was a crude search with a large step size, and the second step was a refined search with a small step size. Finally, the optimal values were found at γ = 16.544 and σ² = 277.827 for SWNIR, and γ = 18.412 and σ² = 0.00442 for LWNIR.
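The two-step grid search described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the kernel scaling exp(-||a-b||²/σ²), the five interleaved cross-validation folds, the log-spaced grid ranges, the helper names (`rbf`, `solve`, `lssvm_fit`, `rmsecv`, `grid_search`) and the synthetic one-variable data are all hypothetical choices. The LS-SVM itself is fitted by solving its standard linear system.

```python
import math
import random


def rbf(a, b, sigma2):
    # RBF kernel; the exp(-||a-b||^2 / sigma2) scaling is an assumed convention.
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / sigma2)


def solve(A, rhs):
    # Gaussian elimination with partial pivoting for the small dense system.
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def lssvm_fit(X, y, gamma, sigma2):
    # LS-SVM regression: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def rmsecv(X, y, gamma, sigma2, k=5):
    # k-fold cross-validation error with interleaved folds.
    n, sq = len(X), 0.0
    for fold in range(k):
        held = set(range(fold, n, k))
        Xtr = [X[i] for i in range(n) if i not in held]
        ytr = [y[i] for i in range(n) if i not in held]
        b, alpha = lssvm_fit(Xtr, ytr, gamma, sigma2)
        for i in held:
            pred = b + sum(a * rbf(xt, X[i], sigma2) for a, xt in zip(alpha, Xtr))
            sq += (pred - y[i]) ** 2
    return math.sqrt(sq / n)


def grid_search(X, y, g_exps, s_exps, best=None):
    # Exhaustive search over log10 exponents; keeps the lowest RMSECV seen.
    for ge in g_exps:
        for se in s_exps:
            err = rmsecv(X, y, 10.0 ** ge, 10.0 ** se)
            if best is None or err < best[0]:
                best = (err, ge, se)
    return best


# Synthetic one-variable data (hypothetical stand-in for PC scores and SSC).
random.seed(0)
X = [[random.uniform(0.0, 6.0)] for _ in range(40)]
y = [math.sin(x[0]) + random.gauss(0.0, 0.05) for x in X]

# Step 1: crude search with a large step (one decade per grid point).
coarse = grid_search(X, y, range(-1, 4), range(-2, 3))
# Step 2: refined search with a small step around the coarse optimum.
offsets = (-0.5, -0.25, 0.0, 0.25, 0.5)
fine = grid_search(X, y, [coarse[1] + d for d in offsets],
                   [coarse[2] + d for d in offsets], best=coarse)

print("coarse: RMSECV=%.4f gamma=%.3g sigma2=%.3g"
      % (coarse[0], 10.0 ** coarse[1], 10.0 ** coarse[2]))
print("fine:   RMSECV=%.4f gamma=%.3g sigma2=%.3g"
      % (fine[0], 10.0 ** fine[1], 10.0 ** fine[2]))
```

Because the refined search starts from the coarse optimum and only accepts strict improvements, its RMSECV can never be worse than the coarse result. As the text cautions, a low RMSECV does not guarantee a low RMSEP, so the chosen pair should still be checked on an independent prediction set.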
Similar to the PCA-ANN fusion model, color space coordinates combined with PCs were used to build the fusion PCA-SVM models. In the color fusion PCA-SVM models, the optimal pair (γ, σ²) was found at γ = 150.355 and σ² = 143.308 for SWNIR, and γ = 51.111 and σ² = 0.00313 for LWNIR. From Table 2, the fusion PCA-SVM models appear to yield better results for the calibration set for both SWNIR and LWNIR. These results are corroborated by the graphs of predicted versus reference values in Fig. 3. A very good predictive ability for the SSC of apples was found, with Rp above 0.90 and RMSEP below 0.48% for the prediction set.
3.4.3. ICA-ANN models
In this proposed method, we use fast ICA to transform the input space composed of NIR spectral data into a feature space consisting of independent components (ICs) representing the underlying information of the original data. The ICs then serve as the input variables of the neural network to build a prediction model. Before ICA can be applied, the NIR spectral data should be centered and whitened. This can be done using an eigenvalue decomposition of the covariance matrix of the input. The new component with the highest eigenvalue indicates the most important principal component of the training data set. ICA was implemented to produce ICs by maximizing the statistical independence of the estimated components. Following fast ICA, six and five ICs were obtained for SWNIR and LWNIR, respectively. Although the ICs are considered to be latent variables, they can be used as feature information for the input of neural networks. The neural network parameters were optimized by the minimal RMSEC value. Both the learning rate and the momentum were set to 0.1, the initial weights were 0.3, and the hyperbolic-tangent function was used. The permitted regression error was set to 0.01 and the
maximal time of training was 2000. The RMSEC values of ICA-ANN were 0.4546% and 0.4370% for SWNIR and LWNIR, respectively. With color as a compensation factor, ICs combined with color space coordinates were used to construct the fusion models. The fusion ICA-ANN models show better results for predicting SSC in apple using both SWNIR and LWNIR. The results indicate that ICA is a very effective multivariate data analysis tool that can be applied to enhance feature information and reduce data dimensionality.
3.4.4. ICA-SVM models
After the application of ICA, LS-SVM models were developed to determine the SSC of apple. The optimal combination (γ, σ²) was found at γ = 1.355 and σ² = 58.557 for SWNIR, and γ = 19.629 and σ² = 52.143 for LWNIR. As can be seen, all ICA-SVM models were better than the linear models, since the LS-SVM model had a higher correlation coefficient and lower RMSEC values. From this point of view, the proposed latent variable selection methods are helpful for potential applications. Compensation models based on ICA-SVM were developed to eliminate the effect of color. The optimal combinations (γ, σ²) were (1.603, 24.779) and (25.662, 100.484) for the ICA-SVM fusion models for SWNIR and LWNIR, respectively. Among all models compared, the ICA-SVM fusion models obtained the best results for the calibration and prediction sets, with Rp values of 0.9398 or higher and RMSEP values of 0.3870% or lower. Considering only the performance on the prediction set, the ICA-SVM fusion model for LWNIR achieved the best performance, with Rp = 0.9455 and RMSEP = 0.3691%. According to the F-test results, the use of color compensation could significantly improve the performance of the ICA-SVM model.
4. Discussion
NIR spectroscopy has been a powerful tool for quality detection and process control in the agricultural and food industries.
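The F-test comparison that Section 3 relies on to call one model more accurate than another reduces to a ratio of squared RMSEP values checked against a tabulated threshold. The Python sketch below works through that arithmetic. The function names `f_ratio` and `errors_differ` are hypothetical, the pairing of the two RMSEP values is illustrative rather than a comparison reported by the authors, and reusing the tabulated value 1.44 from Section 3.3.3 assumes the same prediction-set size and α = 0.10.

```python
def f_ratio(rmsep_a, rmsep_b):
    """Return F_p = RMSEP_1^2 / RMSEP_2^2, with RMSEP_1 the larger error."""
    larger, smaller = max(rmsep_a, rmsep_b), min(rmsep_a, rmsep_b)
    return (larger / smaller) ** 2


def errors_differ(rmsep_a, rmsep_b, f_critical):
    """True when F_p exceeds the tabulated critical value."""
    return f_ratio(rmsep_a, rmsep_b) > f_critical


# RMSEP values quoted in the text for SWNIR: siPLS-SPA (0.5421%) and the
# ICA-SVM fusion model (0.3870%). Pairing them here is illustrative only.
fp = f_ratio(0.5421, 0.3870)

# 1.44 is the tabulated value quoted in Section 3.3.3; reusing it assumes
# the same prediction-set size and alpha = 0.10.
print(round(fp, 3), errors_differ(0.5421, 0.3870, f_critical=1.44))  # 1.962 True
```

Taking the larger error as the numerator keeps the ratio at or above 1, so a single upper-tail critical value suffices for the decision.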
Numerous studies have been reported in recent years on NIR spectroscopy for fast measurement of SSC and other quality attributes of apple. In the study of Eisenstecken et al. (2015), an SSC model was built within the range of 1000–2500 nm, and the SSC prediction model showed inadequate coefficients of determination. Additionally, Peirs et al. (2001) and Kumar et al. (2015) included wavelengths in the visible range to increase model accuracy. Moreover, automatic apple rotation was successfully applied to improve the determination of quality characteristics (Schmutzler and Huck, 2014); that study obtained SECV values of 0.45% and 0.46% for the SSC of Pink Lady and Golden Delicious apples, respectively. Compared to previous investigations, the models built in this work performed very well. In addition, the color factor had not previously been considered as a means of compensating quality models. This work has shown great potential for obtaining reliable predictions of the SSC of apple by both SWNIR and LWNIR as nondestructive tools. Table 2 summarizes the SSC results obtained with the various calibration methods. Compared with conventional PLS models, the variable selection procedures and the color compensation method significantly improved the performance of the final model. From the results we can conclude that (1) nonlinear models are superior to linear models; (2) the ICA algorithm has a better capacity to select variables for modeling; (3) color compensation may further improve the performance of the final model. As to the models developed for SSC, the LWNIR range was slightly better than the SWNIR range. However, no significant differences
in prediction ability were observed between the two spectral ranges.
Acknowledgements
The authors acknowledge the financial support provided by the National Natural Science Foundation for Young Scientists of China (31501216), Advanced Talents Science Foundation of Jiangsu University (15JDG169) and the Natural Science Foundation of Jiangsu Province (Youth) (BK20150502).
References
Alfatni, M., Shariff, A., Abdullah, M., Marhaban, M., Saaed, O., 2013. The application of internal grading system technologies for agricultural products – review. J. Food Eng. 116 (3), 703–725.
Bobelyn, E., Serban, A.S., Nicu, M., Lammertyn, J., Nicolai, B.M., Saeys, W., 2010. Postharvest quality of apple predicted by NIR-spectroscopy: study of the effect of biological variability on spectra and model performance. Postharvest Biol. Technol. 55 (3), 133–143.
Bochereau, L., Bourgine, P., Palagos, B., 1992. A method for prediction by combining data analysis and neural networks: application to prediction of apple quality using near infra-red spectra. J. Agric. Eng. Res. 51, 207–216.
Bureau, S., Ruiz, D., Reich, M., Gouble, B., Bertrand, D., Audergon, J.M., Renard, C.M., 2009. Rapid and non-destructive analysis of apricot fruit quality using FT-near-infrared spectroscopy. Food Chem. 113 (4), 1323–1328.
Burns, D.A., Ciurczak, E.W. (Eds.), Handbook of Near-Infrared Analysis. CRC Press.
Butz, P., Hofmann, C., Tauscher, B., 2005. Recent developments in noninvasive techniques for fresh fruit and vegetable internal quality analysis. J. Food Sci. 70 (9), R131–R141.
Centner, V., Massart, D.L., Noord, O.E., 1996. Detection of inhomogeneities in sets of NIR spectra. Anal. Chim. Acta 330 (1), 1–17.
Chauchard, F., Cogdill, R., Roussel, S., Roger, J.M., Bellon-Maurel, V., 2004. Application of LS-SVM to non-linear phenomena in NIR spectroscopy: development of a robust and portable sensor for acidity prediction in grapes. Chemom. Intell. Lab. Syst. 71 (2), 141–150.
Chen, Q., Ding, J., Cai, J., Zhao, J., 2012. Rapid measurement of total acid content (TAC) in vinegar using near infrared spectroscopy based on efficient variables selection algorithm and nonlinear regression tools. Food Chem. 135 (2), 590–595.
Chen, Q., Zhang, C., Zhao, J., Ouyang, Q., 2013. Recent advances in emerging imaging techniques for non-destructive detection of food quality and safety. Trac-Trend Anal. Chem. 52, 261–274.
Chen, Q., Zhang, D., Pan, W., Ouyang, Q., Li, H., Urmila, K., Zhao, J., 2015. Recent developments of green analytical techniques in analysis of tea’s quality and nutrition. Trends Food Sci. Tech. 43 (1), 63–82.
Chia, K.S., Abdul Rahim, H., Abdul Rahim, R., 2012. Prediction of soluble solids content of pineapple via non-invasive low cost visible and shortwave near infrared spectroscopy and artificial neural network. Biosyst. Eng. 113 (2), 158–165.
Delgado-Pelayo, R., Gallardo-Guerrero, L., Hornero-Méndez, D., 2014. Chlorophyll and carotenoid pigments in the peel and flesh of commercial apple fruit varieties. Food Res. Int. 65, 272–281.
Drogoudi, P.D., Pantelidis, G., 2011. Effects of position on canopy and harvest time on fruit physico-chemical and antioxidant properties in different apple cultivars. Sci. Hortic. 129 (4), 752–760.
Eisenstecken, D., Panarese, A., Robatscher, P., Huck, C.W., Zanella, A., Oberhuber, M., 2015. A near infrared spectroscopy (NIRS) and chemometric approach to improve apple fruit quality management: a case study on the cultivars Cripps Pink and Braeburn. Molecules 20 (8), 13603–13619.
Filgueiras, P.R., Alves, J.C.L., Poppi, R.J., 2014. Quantification of animal fat biodiesel in soybean biodiesel and B20 diesel blends using near infrared spectroscopy and synergy interval support vector regression. Talanta 119, 582–589.
Guo, Z., Chen, Q., Chen, L., Huang, W., Zhang, C., Zhao, C., 2011. Optimization of informative spectral variables for the quantification of EGCG in green tea using Fourier transform near-infrared (FT-NIR) spectroscopy and multivariate calibration. Appl. Spectrosc. 65 (9), 1062–1067.
Huck-Pezzei, V.A., Seitz, I., Karer, R., Schmutzler, M., De Benedictis, L., Wild, B., Huck, C.W., 2014. Alps food authentication, typicality and intrinsic quality by near infrared spectroscopy. Food Res. Int. 62, 984–990.
Hyvärinen, A., Oja, E., 1997. A fast fixed-point algorithm for independent component analysis. Neural Comput. 9 (7), 1483–1492.
Iglesias, I., Echeverría, G., Lopez, M.L., 2012. Fruit color development, anthocyanin content, standard quality, volatile compound emissions and consumer acceptability of several ‘Fuji’ apple strains. Sci. Hortic. 137, 138–147.
Janik, L.J., Cozzolino, D., Dambergs, R., Cynkar, W., Gishen, M., 2007. The prediction of total anthocyanin concentration in red-grape homogenates using visible-near-infrared spectroscopy and artificial neural networks. Anal. Chim. Acta 594 (1), 107–118.
Jha, S.N., Garg, R., 2010. Non-destructive prediction of quality of intact apple using near infrared spectroscopy. J. Food Sci. Technol. 47 (2), 207–213.
Ji-yong, S., Xiao-bo, Z., Xiao-wei, H., Jie-wen, Z., Yanxiao, L., Limin, H., Jianchun, Z., 2013. Rapid detecting total acid content and classifying different types of vinegar based on near infrared spectroscopy and least-squares support vector machine. Food Chem. 138 (1), 192–199.
Jiang, H.Y., Xie, L.J., Peng, Y.S., Ying, Y.B., 2008. Study on the influence of temperature on near infrared spectra. Spectrosc. Spectr. Anal. 28 (7), 1510–1513.
Kumar, S., McGlone, A., Whitworth, C., Volz, R., 2015. Postharvest performance of apple phenotypes predicted by near-infrared (NIR) spectral analysis. Postharvest Biol. Technol. 100, 16–22.
Lammertyn, J., Nicolaï, B., Ooms, K., De Smedt, V., De Baerdemaeker, J., 1998. Nondestructive measurement of acidity, soluble solids, and firmness of Jonagold apples using NIR-spectroscopy. Trans. ASAE 41 (4), 1089–1094.
Li, Y., Shao, X., Cai, W., 2007. A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples. Talanta 72 (1), 217–222.
Liu, Y., Ying, Y., 2005. Use of FT-NIR spectrometry in non-invasive measurements of internal quality of ‘Fuji’ apples. Postharvest Biol. Technol. 37 (1), 65–71.
Lu, R., Guyer, D.E., Beaudry, R.M., 2000. Determination of firmness and sugar content of apples using near-infrared diffuse reflectance. J. Texture Stud. 31 (6), 615–630.
Magwaza, L.S., Opara, U.L., Nieuwoudt, H., Cronje, P.J., Saeys, W., Nicolaï, B., 2012. NIR spectroscopy applications for internal and external quality analysis of citrus fruit – a review. Food Bioprocess Technol. 5 (2), 425–444.
Martínez Vega, M.V., Sharifzadeh, S., Wulfsohn, D., Skov, T., Clemmensen, L.H., Toldam-Andersen, T.B., 2013. A sampling approach for predicting the eating quality of apples using visible-near infrared spectroscopy. J. Sci. Food Agr. 93 (15), 3710–3719.
Mehinagic, E., Royer, G., Bertrand, D., Symoneaux, R., Laurens, F., Jourjon, F., 2003. Relationship between sensory analysis, penetrometry and visible-NIR spectroscopy of apples belonging to different cultivars. Food Qual. Preference 14 (5), 473–484.
Mendoza, F., Lu, R., Cen, H., 2014. Grading of apples based on firmness and soluble solids content using Vis/SWNIR spectroscopy and spectral scattering techniques. J. Food Eng. 125, 59–68.
Merzlyak, M.N., Solovchenko, A.E., Gitelson, A.A., 2003. Reflectance spectral features and non-destructive estimation of chlorophyll, carotenoid and anthocyanin content in apple fruit. Postharvest Biol. Technol. 27 (2), 197–211.
Naguib, I.A., Abdelaleem, E.A., Draz, M.E., Zaazaa, H.E., 2014. Linear support vector regression and partial least squares chemometric models for determination of hydrochlorothiazide and benazepril hydrochloride in presence of related impurities: a comparative study. Spectrochim. Acta A Mol. Biomol. Spectrosc. 130, 350–356.
Nicolaï, B.M., Beullens, K., Bobelyn, E., Peirs, A., Saeys, W., Theron, K.I., Lammertyn, J., 2007. Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: a review. Postharvest Biol. Technol. 46 (2), 99–118.
Nicolaï, B.M., Defraeye, T., De Ketelaere, B., Herremans, E., Hertog, M.L., Saeys, W., Verboven, P., 2014. Nondestructive measurement of fruit and vegetable quality. Annu. Rev. Food Sci. Technol. 5, 285–312.
Ozaki, Y., 2012. Near-infrared spectroscopy – its versatility in analytical chemistry. Anal. Sci. 28 (6), 545–563.
Peirs, A., Lammertyn, J., Ooms, K., Nicolaï, B.M., 2001. Prediction of the optimal picking date of different apple cultivars by means of VIS/NIR-spectroscopy. Postharvest Biol. Technol. 21 (2), 189–199.
Qin, J., Lu, R., 2008. Measurement of the optical properties of fruits and vegetables using spatially resolved hyperspectral diffuse reflectance imaging technique. Postharvest Biol. Technol. 49 (3), 355–365.
Roger, J.M., Chauchard, F., Bellon-Maurel, V., 2003. EPO–PLS external parameter orthogonalisation of PLS application to temperature-independent measurement of sugar content of intact fruits. Chemom. Intell. Lab. Syst. 66 (2), 191–204.
Schmutzler, M., Huck, C.W., 2014. Automatic sample rotation for simultaneous determination of geographical origin and quality characteristics of apples based on near infrared spectroscopy (NIRS). Vib. Spectrosc. 72, 97–104.
Segtnan, V.H., Mevik, B.H., Isaksson, T., Naes, T., 2005. Low-cost approaches to robust temperature compensation in near-infrared calibration and prediction situations. Appl. Spectrosc. 59 (6), 816–825.
Siesler, H.W., Ozaki, Y., Kawata, S., Heise, H.M. (Eds.), Near-Infrared Spectroscopy: Principles, Instruments, Applications. John Wiley & Sons.
Soares, S.F.C., Gomes, A.A., Araujo, M.C.U., Galvão Filho, A.R., Galvão, R.K.H., 2013. The successive projections algorithm. Trac-Trend Anal. Chem. 42, 84–98.
Vapnik, V.N., 1998. Statistical Learning Theory, Vol. 2. Wiley, New York.
Wu, D., He, Y., Nie, P., Cao, F., Bao, Y., 2010. Hybrid variable selection in visible and near-infrared spectral analysis for non-invasive quality determination of grape juice. Anal. Chim. Acta 659 (1), 229–237.
Xiaobo, Z., Jiewen, Z., Xingyi, H., Yanxiao, L., 2007. Use of FT-NIR spectrometry in non-invasive measurements of soluble solids contents (TSSC) of ‘Fuji’ apple based on different PLS models. Chemom. Intell. Lab. Syst. 87 (1), 43–51.
Yao, Y., Chen, H., Xie, L., Rao, X., 2013. Assessing the temperature influence on the soluble solids content of watermelon juice as measured by visible and near-infrared spectroscopy and chemometrics. J. Food Eng. 119 (1), 22–27.
Zhang, B., Huang, W., Li, J., Zhao, C., Fan, S., Wu, J., Liu, C., 2014. Principles, developments and applications of computer vision for external quality inspection of fruits and vegetables: a review. Food Res. Int. 62, 326–343.