Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 230 (2020) 118053
A correlation-analysis-based wavelength selection method for calibration transfer

Zhuopin Xu a,b,1, Shuang Fan a,b,1, Weimin Cheng a,b, Jie Liu a,b, Pengfei Zhang a, Yang Yang a, Cong Xu a,b, Binmei Liu a, Jing Liu a, Qi Wang a,⁎, Yuejin Wu a,⁎

a Key Laboratory of High Magnetic Field and Ion Beam Physical Biology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, 350 Shushanhu Road, Hefei, Anhui 230031, People's Republic of China
b University of Science and Technology of China, No. 96 Jinzhai Road, Hefei, Anhui 230026, People's Republic of China
Article info

Article history: Received 31 October 2019; Received in revised form 7 January 2020; Accepted 9 January 2020; Available online 11 January 2020

Keywords: Calibration transfer; Wavelength selection; Near-infrared spectroscopy
Abstract

Because spectral signals vary among instruments, calibration transfer is required for the further popularization and application of near-infrared spectroscopy (NIRS). To achieve good calibration transfer results, the selected spectral variables should have stable and consistent signals between instruments and should contain the target component information. In this study, a correlation-analysis-based wavelength selection method (CAWS) is proposed for calibration transfer. The method selects wavelengths at which the spectral responses of the master and slave instruments are well correlated, that is, wavelengths with high absolute values of Pearson's correlation coefficient (|Ri|). The proposed CAWS method was applied to two available datasets, corn and rice bran, and its calibration transfer performance was compared with that of other wavelength selection methods. The effects of pretreatment methods and calibration transfer algorithms were also assessed. The CAWS-optimized models obtained lower root mean square errors of prediction after calibration transfer (RMSEPtrans), suggesting that the proposed method can effectively improve the efficiency of calibration transfer. Combinations of this method with other wavelength selection methods and calibration transfer algorithms may further enhance the efficiency of calibration transfer and should therefore be thoroughly investigated. © 2020 Elsevier B.V. All rights reserved.
⁎ Corresponding authors: Q. Wang and Y. Wu.
1 These authors contributed equally to this work and should be considered co-first authors.
https://doi.org/10.1016/j.saa.2020.118053
1386-1425/© 2020 Elsevier B.V. All rights reserved.

1. Introduction

Molecular spectroscopy, especially near-infrared spectroscopy (NIRS) and Raman spectroscopy, has been widely used in the process analysis of biological and chemical products over the past few decades [1–4]. This progress owes much to its combination with multivariate calibration techniques. Building a good multivariate calibration model is often time-consuming and costly, so the model is expected to remain stable and effective over the long term. However, many changes in measurement conditions, including the transfer of models from one instrument to another, the aging and replacement of instrument parts, and various changes in the external environment and the samples, can affect the accuracy and applicability of the calibration model [5–8]. This is because these interference factors and the target component information often overlap intricately in the spectrum and are difficult to separate during modeling. Some studies integrate data acquired under new conditions into the old model to achieve model updating (MU) [5], while more studies focus on calibration transfer. Calibration transfer refers to various chemometric methods that make the calibration model adapt to new instruments or detection conditions [9]. The transferred model is often less accurate than the original one applied on the master instrument, but it is still useful because it saves the time and cost of remodeling. The development of calibration transfer methods has promoted the further popularization and application of spectroscopy in various fields [10–13]. In this contribution, we focus on the transfer of near-infrared models between spectral instruments.

Calibration transfer methods usually rely on constructing a mapping relation between the data (spectra or predicted results) of standard samples acquired on the master and slave instruments. This relation is then fitted as a transfer function so as to match the signals of the new instrument with the original ones. Traditional transfer methods such as slope-bias correction (SBC) [14], direct standardization (DS) [15], and piecewise direct standardization (PDS) [16] have been widely applied and are often compared with new methods. Some new transfer methods are based on factor analysis, which separates spectral information-related factors from noise to optimize the transfer results. The singular value decomposition based spectral space transformation algorithm (SST) [17], the alternating three-dimensional linear decomposition based ATLD algorithm [18], the canonical correlation analysis based CCA algorithm [19], and the spectral regression method [20], which introduces the data into a regression framework for dimension reduction, were all developed for calibration transfer. In these reports, the methods achieved results close to or even better than those of PDS. Other new approaches are based on machine learning, such as moving window support vector regression (SVR) [21] and the extreme learning machine auto-encoder (TEAM) [22]. These methods have the advantage of describing the mapping between the signals before and after the transfer nonlinearly. In addition to the above, many new methods are based on other principles. Folch-Fortuny et al. [23] regarded calibration transfer from one instrument to another as a missing data imputation problem, and adopted two algorithms, trimmed scores regression (TSR) and joint-Y partial least squares regression (JYPLS), to reconstruct the slave spectrum. Based on PDS and PRS (piecewise reverse standardization), Liang et al. [24] and Sun et al. [25] introduced two different sample partition methods, Rank-KS and Rank-spxy, respectively, to improve the transfer efficiency by making the standard samples more representative of the validation samples.

Some changes in the measurement conditions, such as changes in the light source, may make it impossible to measure the standard samples under the primary condition. In addition, the standard samples are often required to have physical and chemical properties similar to those of the products to be tested [26]. To avoid the complications of selecting and applying standard samples, many calibration transfer algorithms that do not require them have recently been proposed. Liu et al.
[27] assumed that the spectra of samples with similar physical and chemical properties measured on different instruments are linearly correlated, and thus developed a standard-free algorithm (linear model correction, LMC). Since multiplicative scatter correction (MSC) can reduce the scattering effect in reflection spectra, K.E. Kramer et al. [28] proposed two methods (windowed MSC and moving window MSC) based on this pretreatment to realize calibration transfer between two different instruments without standards. B. Malli et al. [29] evaluated the performance of a variety of calibration transfer methods when no real standards or only a few standards are available, and ranked these methods by transfer performance; some of them draw on machine learning (e.g., transfer component analysis (TCA)).

In addition to new calibration transfer methods, another way to improve the transfer efficiency is to select appropriate spectral variables. Firstly, these spectral variables should contain characteristic bands related to the target components. Many methods are used to remove background noise and irrelevant signals while retaining the characteristic bands [30–32]. These methods are often applied in modeling. Since differences between instruments can be classified as irrelevant signals, the methods are also applicable to calibration transfer. H. Swierenga et al. [33] used a simulated annealing (SA) algorithm to screen moisture-related spectral variables in pills, and obtained better calibration transfer results than the DS and PDS methods. Zhang et al. [34] adopted the stability competitive adaptive reweighted sampling (SCARS) algorithm for wavelength selection, and realized the transfer of corn models without standards. Secondly, wavelength selection is also related to the response between instruments. It is well known that the spectra of the same sample vary between spectrometers.
The differences between these spectra vary with wavelength; that is, the consistency of the signal response between instruments differs from one wavelength to another. Selecting wavelengths with high inter-instrument signal consistency is beneficial for achieving good transfer results. The screening wavelengths with consistent and stable signals (SWCSS) algorithm designed by L. Ni et al. [35] is an example of such a method. It screens for wavelengths with consistent and stable signals by evaluating the relative values of the standard deviations of the inter-instrument spectra at different wavelengths, and achieves results superior to those of the PDS method.
However, considering the influence of the above two factors on calibration transfer, the ideal wavelengths should both show consistent signals between the master and slave instruments and contain sufficient spectral information on the target component. The research of L. Zhang et al. [36] embodies this idea. By combining the SWCSS algorithm with other wavelength screening methods such as uninformative variable elimination (UVE), variable importance in projection (VIP), and selectivity ratio (SR), the wavelengths are further selected to realize robust calibration transfer. More optimized methods to achieve this goal remain to be studied.

In this research, we propose a new correlation analysis based wavelength selection (CAWS) method for screening and selecting wavelengths that give good calibration transfer results. To analyze the consistency of standard spectra measured by the master and slave instruments, we used the absolute value of Pearson's correlation coefficient (|Ri|). Wavelengths with high |Ri| values were selected for modeling and calibration transfer, and a batch of standard samples measured on both the master and slave instruments (called CAWS standard samples) was evaluated to ensure a close relationship between the selected wavelengths and the target component. The model constructed from these wavelengths is not optimal for the master instrument, but its calibration transfer result is better because it is specifically designed for the slave instrument. In addition, as a wavelength selection method, CAWS can be combined with other methods and calibration transfer algorithms. The performance of the proposed CAWS method was assessed on the corn and rice bran datasets and compared with that of other wavelength selection methods, under different pretreatment and calibration transfer algorithm conditions.
The obtained results were used to verify whether the optimized models yielded the best calibration transfer results, as well as to confirm the superiority of the CAWS method.

2. Materials and methods

2.1. Correlation analysis wavelength selection method (CAWS)

The CAWS method proposed in this study is based on the assumption that selecting wavelengths at which the values of |Ri| (the correlation coefficient between the spectral signals measured using the master and slave instruments) are relatively large will lead to better predicted results in calibration transfer. Since many calibration transfer algorithms rely on a linear transformation from data (spectra or predicted results) obtained using the slave instrument to those determined by the master instrument, it is expected that spectra restricted to highly correlated wavelengths will work well with these algorithms. The procedure of CAWS optimization is illustrated in Fig. 1. First, the NIR spectra of standard samples were recorded on the master and slave instruments; then, the absolute values of Pearson's correlation coefficient [37] were calculated for each wavelength (i), as per Eq. (1):

$$|R_i| = \frac{\left|\sum_{j=1}^{m} (a_{ij}-\bar{a}_i)(b_{ij}-\bar{b}_i)\right|}{\sqrt{\sum_{j=1}^{m} (a_{ij}-\bar{a}_i)^2}\,\sqrt{\sum_{j=1}^{m} (b_{ij}-\bar{b}_i)^2}} \qquad (1)$$

where m is the number of standard samples, $a_{ij}$ and $b_{ij}$ are the absorbance values of the jth sample at the ith wavelength measured using the master and slave instruments, respectively, and $\bar{a}_i$ and $\bar{b}_i$ are the average absorbance values of all samples measured at the ith wavelength using the master and slave instruments, respectively. The Corr curve (Corr = {|R1|, |R2|, …, |Ri|, …, |Rn|}, where n is the number of wavelengths) was obtained by plotting the calculated |Ri| values as a function of wavelength. Assuming that |R|max represents the maximum of the Corr curve, the optimal threshold correlation coefficient, T (T ∈ (0, |R|max)), was determined by screening. All wavelengths at which the corresponding |Ri|
Fig. 1. Flowchart of CAWS optimization for calibration transfer.
values were found to be greater than or equal to T were considered characteristic wavelengths and were selected for modeling and calibration transfer using the spectra recorded on the master and slave instruments, respectively. The root mean square error of the results predicted by the optimized model after calibration transfer is denoted as RMSEPtrans.

The process of threshold screening aims to ensure that the selected wavelengths are suitable for calibration transfer. A higher threshold is advantageous in that it selects wavelengths at which the responses of the two instruments are more consistent. However, large thresholds carry the risk of eliminating wavelengths that correspond to the target constituent. The screening method adopted in this study minimizes this risk. Briefly, the threshold was gradually increased from 0 in increments of k (set at 0.001), up to the point where the number of wavelengths with |Ri| > T was less than or equal to the number of partial least squares (PLS) factors. The wavelengths selected from the master instrument calibration spectra for modeling varied with the threshold value. A well-selected subset of each calibration set, named the CAWS standard samples, was then used to evaluate the calibration transfer performance. This was achieved by transferring the models to the slave instrument and using them to predict the CAWS standard samples from their slave spectra. The predicted results were subsequently compared to the reference values of these samples, and the root mean square error of the CAWS standard samples (RMSES) was calculated. The threshold value yielding the minimum RMSES was taken as the optimal threshold T.

The CAWS method adopted in our study can be combined with other wavelength selection strategies, such as synergy interval partial least squares (siPLS) [38].
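The Corr curve of Eq. (1) and the threshold screening described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names (`corr_curve`, `select_wavelengths`, `screen_threshold`, `rmse`) and the `rmse_s` callback are hypothetical. The callback is assumed to build a model on the master calibration spectra restricted to the given wavelength indices, transfer it to the slave instrument, and return the RMSES of the CAWS standard samples. The stopping rule counts wavelengths with the same "greater than or equal to T" criterion used for selection, which is also an assumption.

```python
import numpy as np

def corr_curve(master, slave):
    """|R_i| for each wavelength; inputs are (m samples, n wavelengths) arrays."""
    a = master - master.mean(axis=0)
    b = slave - slave.mean(axis=0)
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0)) * np.sqrt((b ** 2).sum(axis=0))
    # A constant column gives 0/0; such wavelengths are treated as uncorrelated.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(den > 0, num / den, 0.0)
    return np.abs(r)

def select_wavelengths(corr, threshold):
    """Indices of wavelengths whose |R_i| is at least the threshold."""
    return np.flatnonzero(corr >= threshold)

def rmse(pred, ref):
    """Root mean square error between predicted and reference values."""
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def screen_threshold(corr, n_pls_factors, rmse_s, step=0.001):
    """Raise T from 0 in fixed steps and keep the T with the lowest RMSES.

    `rmse_s(indices)` is a hypothetical user-supplied callback that returns
    the RMSES obtained with the given wavelength indices. Screening stops
    once no more wavelengths remain than there are PLS factors.
    """
    best_t, best_err = None, np.inf
    k = 0
    while True:
        t = k * step
        idx = select_wavelengths(corr, t)
        if idx.size <= n_pls_factors:
            break
        err = rmse_s(idx)
        if err < best_err:
            best_t, best_err = t, err
        k += 1
    return best_t, best_err
```

With `corr = corr_curve(master_std, slave_std)` computed from the standard spectra, `screen_threshold(corr, n_pls_factors, rmse_s)` returns the threshold with the lowest RMSES and that RMSES value. The complementary index set `np.flatnonzero(corr < T)` would correspond to the reversed selection criterion.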
In siPLS, the spectra are divided into p segments of equal length, and the spectral data of q segments (q < p) are combined for PLS regression. The optimal segment combination (minimum RMSECV) determined by siPLS is then used by CAWS for wavelength selection. This combined method is named si-CAWS. To verify the validity of the wavelengths selected by CAWS, the proposed method is compared with the reversed-CAWS method, in which the criterion for wavelength selection is |Ri| < T. Based on this criterion, reversed-CAWS selects all wavelengths not selected by CAWS.

2.2. Datasets

The CAWS method proposed herein was validated on two datasets, corn and rice bran. The corn dataset is available at http://software.eigenvector.com/Data/Corn/index.html and has been used in many studies [17–21,23,27,29,34,35]. Meanwhile, the rice bran dataset is
specifically designed for the NIR analysis of rice milling degree. Since white rice is rich in starch, the starch content in rice bran will increase with the increase in rice milling degree. Therefore, by detecting the starch content in rice bran, the milling degree of rice can be nondestructively and quantitatively analyzed. 2.2.1. Corn dataset The corn dataset comprises the NIR spectra of 80 corn samples measured using two different spectrometers (mp5 and mp6). The mp5 was designated as the master instrument, while the mp6 was termed the slave instrument. The spectra collected on both systems spanned the range of 1100–2498 nm, with 2 nm intervals (700 spectral variables). The corn oil content detection model built on the mp5 spectrometer was transferred to the mp6 system. Reference values of oil content in each sample (3.09%–3.83%) were also provided. 2.2.2. Rice bran dataset The spectra of a total of 80 rice bran samples, with different milling degrees, were provided. Of the 80 samples, 43 were collected from the Shuanghu rice factory in Anhui province, China, while the rest were obtained by processing 37 rice samples of different genetic backgrounds. The processed rice was harvested, dehusked, and milled at the science island of Hefei in Anhui province, China. Circular quartz dishes that are 24 mm in depth and 20 mm in diameter were filled with rice bran samples and placed on the detection window of a Bruker MPA Fourier transform near-infrared spectrometer (Bruker, Germany). The NIR diffuse reflection spectrum of each sample was collected once by averaging a total of 32 scans performed in the range of 1200–2300 nm, at 1.074 nm intervals (1024 spectral variables). The same samples were deposited into 10 mm deep glass plates (diameter = 17 mm) for analysis using a Luminar 3076 AOTF NIR spectrometer (Brimrose, USA). 
Each NIR diffuse reflection spectrum collected on this instrument constituted the average of 10 scans recorded in the spectral range of 1200–2300 nm, at 1 nm intervals. The AOTF spectra were interpolated so that they had the same range and intervals as the Bruker spectra. The content of starch in the analyzed rice bran samples was determined by polarimetry, using a slightly modified version of the procedure detailed in Chinese National Standard GB 5006-1985 [39]. Briefly, 3 g of each sample was accurately weighed and extracted by Soxhlet [40]. Subsequently, 0.2 g of the extract was tested for moisture content by an oven-dry method [40], and another 1.25 g was centrifuged twice with 60% ethanol in a 50 mL centrifuge tube. The precipitate collected after centrifugation was transferred to a 100 mL triangular bottle and
mixed with 25 mL of calcium chloride acetic acid solution (pH = 2.3). The resulting mixture was kept in a 100 °C water bath for 40 min, then centrifuged in a 50 mL tube containing 0.5 mL of 30% zinc sulfate solution, 0.5 mL of 15% potassium ferrocyanide solution, and distilled water. Subsequently, the solution was filtered and analyzed with a WZZ1 automatic polarimeter (Shanghai INESA Physico-Optical Instrument Co., Ltd., China) in order to determine the content of starch, as per the following equation:

$$\mathrm{Starch}\ (\%) = \frac{\alpha \times 10^{6}}{L\,W\,(100-H) \times 203} \qquad (2)$$
where W is the weight of the rice bran sample (g), H is the moisture content (%), L is the length of the rotator (mm), and α is the reading of the polarimeter. The content of starch in the analyzed samples was found to be in the range of 8.82%–84.59%. The rice bran starch detection model built on the master Bruker spectrometer was transferred to the slave AOTF instrument.

2.2.3. Sample classification

Based on the master instrument spectra and the reference values, the samples in each dataset were divided into 60 calibration and 20 validation samples by the spxy method [41]. The first 8 calibration samples were taken as standards for Corr curve construction and calibration transfer, and the first 30 calibration samples were taken as standards for CAWS optimization (the CAWS standard samples).

2.3. Data analysis

Three calibration transfer algorithms, SBC [14], PDS [16], and SST (spectral space transformation) [17], were used to transfer the calibration models. SBC is a widely used algorithm that realizes transfer by correcting the predicted values. It is based on the assumption that the responses of the master and slave instruments are linearly related. In order to bring the values predicted on the slave instrument close to those determined on the master system, SBC performs a reverse linear transformation using the calculated slope and bias of the linear curve. Meanwhile, the PDS algorithm corrects the differences between the two instruments by assuming that the spectral responses are linearly correlated within a spectral segment (often called a window) delimited by two wavelengths, one from the slave spectrum and one from the master spectrum. PLS or PCR (principal component regression) is used to fit the regression equation between the responses of the two instruments. Ultimately, a new spectrum similar to that of the master instrument is reconstructed by moving the window along the slave spectrum.
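The SBC and PDS steps described above can be illustrated with a short NumPy sketch. It is a simplified illustration under stated assumptions rather than the implementation used in the study: all function names (`fit_sbc`, `apply_sbc`, `fit_pds`, `apply_pds`) and the `half_width` parameter are hypothetical, SBC is written in one common formulation (master-side predictions of the standards regressed on slave-side predictions), and the PDS window regression uses a hand-written one-component PLS1 fit. A `half_width` of 0 gives a window of a single wavelength.

```python
import numpy as np

def fit_sbc(pred_slave_std, pred_master_std):
    """Slope and bias that map slave-side predictions onto master-side ones."""
    slope, bias = np.polyfit(pred_slave_std, pred_master_std, deg=1)
    return slope, bias

def apply_sbc(pred_slave, slope, bias):
    """Correct predictions obtained from slave spectra."""
    return slope * np.asarray(pred_slave, dtype=float) + bias

def fit_pds(master_std, slave_std, half_width=0):
    """Banded transfer matrix F and offset such that slave @ F + offset ~ master.

    Each master wavelength is regressed on a window of slave wavelengths
    with a one-component PLS1 model fitted on the standard samples.
    """
    n = master_std.shape[1]
    F = np.zeros((n, n))
    offset = np.zeros(n)
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        X, y = slave_std[:, lo:hi], master_std[:, i]
        x_mean, y_mean = X.mean(axis=0), y.mean()
        Xc, yc = X - x_mean, y - y_mean
        w = Xc.T @ yc                      # PLS weight vector (unnormalized)
        norm = np.linalg.norm(w)
        if norm == 0:                      # no covariance: predict the mean
            offset[i] = y_mean
            continue
        w = w / norm
        t = Xc @ w                         # scores of the single component
        b = w * (t @ yc) / (t @ t)         # regression vector for this window
        F[lo:hi, i] = b
        offset[i] = y_mean - x_mean @ b
    return F, offset

def apply_pds(slave, F, offset):
    """Map slave spectra into the master instrument's response space."""
    return np.asarray(slave, dtype=float) @ F + offset
```

In use, `fit_pds` would be called on the standard spectra measured on both instruments, and `apply_pds` on the slave spectra of the validation samples before prediction with the master model; `fit_sbc` and `apply_sbc` act on predicted values instead of spectra.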
In this study, PLS was used to fit the regression equation, and the default PLS factor number and window size were set to 1. Finally, the SST algorithm performs singular value decomposition on the standard spectra of master and slave instruments, thereby dividing them into spectral information and noise. When constructing the transfer matrix, the decomposed spectral information is transferred without the noise, resulting in reduced deviation upon changing the instrument or the operative conditions. The designation of decomposed data as spectral information or noise depends on the number of SST factors. Thus, this parameter should be optimized, while keeping in mind that it must not exceed the number of standard samples [42]. Twelve modeling and transfer assessments were conducted using each calibration transfer algorithm, based on different combinations of pretreatment and wavelength selection methods. In terms of pretreatment, the samples were either not pretreated at all or pretreated based on the standard normal variate transform (SNV) or first derivative (default set to 17 point smoothing) methods. As for wavelength selection, four methods were applied, namely the full spectrum, CAWS, siPLS, and si-CAWS methods. Herein, the spectra were divided into 10 equal segments by siPLS, but only 4 segments were considered for PLS regression. The maximum PLS factor number was set to 15. In total, 36
experimental groups were analyzed by different combinations of pretreatment, wavelength selection, and calibration transfer methods. For each one of these groups, the model yielding minimum RMSES was selected. For modeling purposes, all six pretreatment (no pretreatment, SNV, and first derivative) and calibration transfer (SBC, PDS, and SST) methods were assessed, but only two wavelength selection methods (full spectrum, siPLS) were considered. The group producing the lowest RMSECV values was deemed the optimal master instrument model and used as the control group. All data were processed on Matlab software, version 2015b (The Mathworks, USA). 3. Results and discussion 3.1. Calibration transfer results of two datasets The calibration results determined for each dataset may be divided into three categories of 12 experimental and one control group results, based on the calibration transfer algorithm used to analyze them. 3.1.1. Calibration transfer of the corn dataset The cross validation and predicted results of transferring the mp5 corn oil content detection model to mp6 are shown in Table 1. The results summarized in Table 1 indicate that within a particular category and for the same pretreatment method, the CAWS and siCAWS optimized groups exhibit lower RMSES values than the full spectrum and siPLS groups, respectively. A similar trend is observed for the RMSEPtrans values, albeit with some exceptions. For instance, the RMSES and RMSEPtrans values of groups 2–6 and 2–8 are lower than those of groups 2–5 and 2–7, respectively. The groups showing the lowest RMSES value (bold black values in Table 1) in each category are the same as those presenting the least RMSEPtrans (bold red values in Table 1). Namely, they are the 1–10, 2–2, and 3–10 groups, all of which are optimized by CAWS. Since the 3–10 group exhibits the lowest RMSES among all investigated groups, it is taken as the model with the best calibration transfer results. 
This group was pretreated using the first derivative method then analyzed by the CAWS wavelength selection method and SST calibration transfer algorithm. Yan Liu et al. [18] and Jiangtao Peng et al. [20] processed the same dataset (transferring the corn oil mp5 model to mp6) using different calibration transfer algorithms. The former group of researchers showed that for 54 calibration and 26 validation samples, the minimum RMSEPtrans value is 0.1519. Meanwhile, the optimal RMSEPtrans value calculated by Jiangtao Peng et al. using a more complex sample partitioning strategy is 0.098. Comparatively, a lower RMSEPtrans minimum is determined in this study (0.070), which indicates that our model is superior to those reported in the other two references. The control groups (1-CK, 2-CK, and 3-CK) in the three categories were subjected to the siPLS wavelength selection method (eight PLS factors), with no prior treatment. The RMSECV values of these groups were found to be lowest (0.093) among all the investigated groups. However, for a particular category, the control group yielded greater RMSES and RMSEPtrans values than many experimental groups. Interestingly, a comparison of the 2-CK and 2–3 groups that differ only in PLS factor number shows that the former has lower RMSECV and higher RMSES and RMSEPtrans values than the latter. 3.1.2. Calibration transfer of the rice bran dataset Table 2 summarizes the results of cross validation and AOTF transfer prediction for the rice bran Bruker starch content detection model. The data presented in Table 2 show a similar trend to that summarized in Table 1. Within the same category and for the same pretreatment conditions, the CAWS and si-CAWS optimized groups show lower RMSES values than the full spectrum and siPLS groups, respectively. Similarly, most RMSEPtrans values of the former groups are less than those of the latter groups. Overall, the 1–10, 2–10, and 3–10 groups
Table 1 Cross validation and predicted results of transferring the mp5 corn oil content detection model to mp6 under different conditions.

Category | Group | Calibration transfer (a) | Pretreatment | Wavelength selection | Spectral variables | RMSECV | RMSES | RMSEPtrans | Threshold | PLS factors
1 | 1–1 | SBC | None | None | 700 | 0.106 | 0.056 | 0.085 | / | 12
1 | 1–2 | SBC | None | CAWS | 533 | 0.108 | 0.052 | 0.082 | 0.999 | 12
1 | 1–3 | SBC | None | SiPLS | 280 | 0.110 | 0.061 | 0.080 | / | 14
1 | 1–4 | SBC | None | Si-CAWS | 280 | 0.110 | 0.061 | 0.080 | 0 | 14
1 | 1–5 | SBC | SNV | None | 700 | 0.139 | 0.065 | 0.111 | / | 13
1 | 1–6 | SBC | SNV | CAWS | 333 | 0.118 | 0.065 | 0.116 | 0.998 | 12
1 | 1–7 | SBC | SNV | SiPLS | 280 | 0.101 | 0.070 | 0.106 | / | 11
1 | 1–8 | SBC | SNV | Si-CAWS | 273 | 0.101 | 0.067 | 0.106 | 0.972 | 10
1 | 1–9 | SBC | 1st der | None | 700 | 0.115 | 0.053 | 0.073 | / | 12
1 | 1–10 | SBC | 1st der | CAWS | 625 | 0.120 | 0.049 | … | 0.900 | 13
1 | 1–11 | SBC | 1st der | SiPLS | 280 | 0.106 | 0.056 | 0.092 | / | 8
1 | 1–12 | SBC | 1st der | Si-CAWS | 280 | 0.106 | 0.056 | 0.092 | 0 | 8
1 | 1-CK | SBC | None | SiPLS | 280 | 0.093 | 0.069 | 0.099 | / | 8
2 | 2–1 | PDS | None | None | 700 | 0.106 | 0.053 | 0.082 | / | 12
2 | 2–2 | PDS | None | CAWS | 533 | 0.108 | 0.051 | … | 0.999 | 12
2 | 2–3 | PDS | None | SiPLS | 280 | 0.102 | 0.059 | 0.093 | / | 11
2 | 2–4 | PDS | None | Si-CAWS | 280 | 0.102 | 0.059 | 0.093 | 0 | 11
2 | 2–5 | PDS | SNV | None | 700 | 0.121 | 0.061 | 0.105 | / | 11
2 | 2–6 | PDS | SNV | CAWS | 333 | 0.118 | 0.058 | 0.096 | 0.998 | 12
2 | 2–7 | PDS | SNV | SiPLS | 280 | 0.100 | 0.067 | 0.108 | / | 8
2 | 2–8 | PDS | SNV | Si-CAWS | 124 | 0.129 | 0.059 | 0.079 | 0.998 | 15
2 | 2–9 | PDS | 1st der | None | 700 | 0.110 | 0.056 | 0.081 | / | 10
2 | 2–10 | PDS | 1st der | CAWS | 659 | 0.104 | 0.053 | 0.078 | 0.846 | 10
2 | 2–11 | PDS | 1st der | SiPLS | 280 | 0.106 | 0.056 | 0.093 | / | 8
2 | 2–12 | PDS | 1st der | Si-CAWS | 280 | 0.106 | 0.056 | 0.093 | 0 | 8
2 | 2-CK | PDS | None | SiPLS | 280 | 0.093 | 0.066 | 0.094 | / | 8
3 | 3–1 | SST(8) | None | None | 700 | 0.115 | 0.051 | 0.070 | / | 14
3 | 3–2 | SST(8) | None | CAWS | 533 | 0.108 | 0.049 | 0.076 | 0.999 | 12
3 | 3–3 | SST(3) | None | SiPLS | 280 | 0.102 | 0.059 | 0.102 | / | 11
3 | 3–4 | SST(3) | None | Si-CAWS | 280 | 0.102 | 0.059 | 0.102 | 0 | 11
3 | 3–5 | SST(8) | SNV | None | 700 | 0.139 | 0.048 | 0.083 | / | 13
3 | 3–6 | SST(8) | SNV | CAWS | 700 | 0.139 | 0.048 | 0.083 | 0 | 13
3 | 3–7 | SST(3) | SNV | SiPLS | 280 | 0.101 | 0.062 | 0.098 | / | 8
3 | 3–8 | SST(3) | SNV | Si-CAWS | 124 | 0.112 | 0.055 | 0.080 | 0.998 | 13
3 | 3–9 | SST(4) | 1st der | None | 700 | 0.115 | 0.049 | 0.071 | / | 12
3 | 3–10 | SST(4) | 1st der | CAWS | 645 | 0.105 | 0.047 | … | 0.874 | 10
3 | 3–11 | SST(2) | 1st der | SiPLS | 280 | 0.106 | 0.056 | 0.090 | / | 8
3 | 3–12 | SST(2) | 1st der | Si-CAWS | 265 | 0.105 | 0.056 | 0.093 | 0.876 | 8
3 | 3-CK | SST(8) | None | SiPLS | 280 | 0.093 | 0.067 | 0.091 | / | 8

Note: the lowest RMSES and RMSEPtrans values for each category are shown in bold black and red, respectively.
a The number in parentheses in this column is the number of SST factors.
optimized by CAWS achieved the lowest RMSES values within their corresponding categories (bold black values in Table 2). The 2–10 and 3–10 groups also had the lowest RMSEPtrans (bold red values in Table 2) in categories 2 and 3, respectively; however, the group with the lowest RMSEPtrans in category 1 is the CAWS-optimized 1–6 group. Among all investigated groups, 3–10 presented the lowest RMSES and was thus considered the model with the best calibration transfer results. The non-pretreated, siPLS-analyzed (12 PLS factors) control groups (1-CK, 2-CK, and 3-CK) of categories 1, 2, and 3 showed the minimum RMSECV value of 6.545. However, as with the corn dataset, the control groups of the rice bran dataset showed greater RMSES and RMSEPtrans values than many experimental groups. Compared with the corresponding groups that differed only in the number of PLS factors (1–3, 2–3, and 3–3), the control groups of categories 1, 2, and 3 had lower RMSECV but higher RMSES and RMSEPtrans values.

3.2. Comparison of wavelengths selected by CAWS and siPLS

The wavelengths selected by the siPLS method are those showing good cross-validation results, since they are often closely related to the target constituent. CAWS, on the other hand, generally screens for wavelengths at which the responses of the master and slave instruments are more consistent. Although each method has its own advantages, one of them optimizes calibration transfer better. The calibration transfer results presented in Tables 1 and 2 show that for most investigated conditions, the RMSES and RMSEPtrans values of CAWS are lower than those of siPLS, which signifies that the former yields better
optimization than the latter. The data in category 2 of Table 1 were used as an example to explain the difference between the wavelengths selected by the two methods. Fig. 2 illustrates the wavelengths selected for the corn dataset using the siPLS and CAWS methods under different pretreatment conditions (the corresponding results are shown in groups 2–2, 2–3, 2–6, 2–7, 2–10, and 2–11 in Table 1). The shaded parts in Fig. 2 represent the segments selected by siPLS, whereas the solid and dotted lines correspond to the Corr curve and the optimal CAWS threshold, respectively. The wavelengths selected by CAWS are those for which the |Ri| values are greater than or equal to the threshold. Based on the results depicted in Fig. 2, wavelengths in the range of 1520–1800 nm were selected by siPLS for all three pretreatment conditions. The CAWS-optimized wavelengths also include this spectral window. It should be noted that the first overtone of CH stretching, a vibration that is closely related to oil content, occurs in the same wavelength range [43]. The variation in Corr curve profiles recorded under different pretreatment conditions indicates that the extent of agreement between the responses of the master and slave instruments strongly depends on the pretreatment method. In the absence of pretreatment, all |Ri| values on the Corr curve are above 0.999, except for those in the CAWS-eliminated spectral range of 1100–1300 nm. After SNV treatment, two troughs appear in the Corr curve near 1400 and 2140 nm. The wavelengths corresponding to both troughs were successfully removed by CAWS; however, the second trough fell within the siPLS-optimized segments. Finally, with first derivative pretreatment, many troughs appear. Some were included in the
Table 2. Cross validation and predicted AOTF transfer results of the rice bran Bruker starch content detection model.

Category | Group | Calibration transfer (a) | Pretreatment | Wavelength selection | Spectral variables | RMSECV | RMSES | RMSEPtrans | Threshold | PLS factors
1 | 1–1 | SBC | None | None | 1024 | 9.085 | 5.700 | 9.674 | / | 5
1 | 1–2 | SBC | None | CAWS | 268 | 10.820 | 4.129 | 5.021 | 0.943 | 10
1 | 1–3 | SBC | None | SiPLS | 409 | 6.844 | 4.705 | 6.031 | / | 14
1 | 1–4 | SBC | None | Si-CAWS | 290 | 8.680 | 3.924 | 5.771 | 0.837 | 15
1 | 1–5 | SBC | SNV | None | 1024 | 11.787 | 5.510 | 6.391 | / | 2
1 | 1–6 | SBC | SNV | CAWS | 775 | 8.798 | 3.936 | [missing]† | 0.845 | 8
1 | 1–7 | SBC | SNV | SiPLS | 409 | 8.961 | 6.406 | 8.179 | / | 4
1 | 1–8 | SBC | SNV | Si-CAWS | 376 | 8.903 | 6.376 | 8.168 | 0.385 | 4
1 | 1–9 | SBC | 1st der | None | 1024 | 12.349 | 6.178 | 7.675 | / | 2
1 | 1–10 | SBC | 1st der | CAWS | 463 | 9.333 | 3.172* | 6.645 | 0.781 | 10
1 | 1–11 | SBC | 1st der | SiPLS | 409 | 10.394 | 5.556 | 6.966 | / | 3
1 | 1–12 | SBC | 1st der | Si-CAWS | 259 | 9.434 | 5.055 | 6.345 | 0.918 | 4
1 | 1-CK | SBC | None | SiPLS | 409 | 6.545 | 5.695 | 6.658 | / | 12
2 | 2–1 | PDS | None | None | 1024 | 11.433 | 7.114 | 9.472 | / | 3
2 | 2–2 | PDS | None | CAWS | 695 | 9.586 | 6.946 | 9.917 | 0.928 | 6
2 | 2–3 | PDS | None | SiPLS | 409 | 12.325 | 6.890 | 8.598 | / | 3
2 | 2–4 | PDS | None | Si-CAWS | 135 | 11.243 | 5.969 | 5.989 | 0.940 | 3
2 | 2–5 | PDS | SNV | None | 1024 | 9.620 | 4.280 | 5.228 | / | 8
2 | 2–6 | PDS | SNV | CAWS | 373 | 12.221 | 3.434 | 5.116 | 0.942 | 8
2 | 2–7 | PDS | SNV | SiPLS | 409 | 8.961 | 4.951 | 6.273 | / | 4
2 | 2–8 | PDS | SNV | Si-CAWS | 141 | 8.646 | 3.621 | 5.134 | 0.921 | 8
2 | 2–9 | PDS | 1st der | None | 1024 | 10.014 | 3.728 | 4.931 | / | 9
2 | 2–10 | PDS | 1st der | CAWS | 752 | 8.747 | 2.936* | 4.132† | 0.705 | 9
2 | 2–11 | PDS | 1st der | SiPLS | 409 | 8.394 | 3.593 | 4.446 | / | 7
2 | 2–12 | PDS | 1st der | Si-CAWS | 351 | 8.217 | 3.213 | 5.146 | 0.843 | 10
2 | 2-CK | PDS | None | SiPLS | 409 | 6.545 | 19.035 | 28.322 | / | 12
3 | 3–1 | SST7 | None | None | 1024 | 8.758 | 3.072 | 5.420 | / | 7
3 | 3–2 | SST7 | None | CAWS | 1024 | 8.758 | 3.072 | 5.420 | 0 | 7
3 | 3–3 | SST8 | None | SiPLS | 409 | 6.620 | 3.494 | 6.519 | / | 11
3 | 3–4 | SST2 | None | Si-CAWS | 293 | 8.880 | 3.371 | 5.220 | 0.830 | 15
3 | 3–5 | SST7 | SNV | None | 1024 | 9.179 | 2.516 | 5.576 | / | 7
3 | 3–6 | SST8 | SNV | CAWS | 987 | 9.384 | 2.479 | 6.632 | 0.358 | 7
3 | 3–7 | SST4 | SNV | SiPLS | 409 | 7.671 | 2.543 | 6.312 | / | 7
3 | 3–8 | SST4 | SNV | Si-CAWS | 406 | 7.681 | 2.537 | 6.238 | 0.035 | 7
3 | 3–9 | SST8 | 1st der | None | 1024 | 9.796 | 2.339 | 4.365 | / | 7
3 | 3–10 | SST4 | 1st der | CAWS | 1024 | 9.796 | 2.339* | [missing]† | 0 | 7
3 | 3–11 | SST8 | 1st der | SiPLS | 409 | 8.199 | 2.964 | 6.042 | / | 11
3 | 3–12 | SST8 | 1st der | Si-CAWS | 209 | 7.903 | 2.465 | 5.002 | 0.931 | 9
3 | 3-CK | SST8 | None | SiPLS | 409 | 6.545 | 4.072 | 6.933 | / | 12

Note: the lowest RMSES and RMSEPtrans values for each category (bold black and red in the original) are marked with * and †, respectively.
a The number appended to SST is the number of SST factors.
segments optimized by siPLS, but with |Ri| < 0.846, none were kept by CAWS.

3.3. Calibration transfer results at different wavelengths

3.3.1. Transfer results of the corn oil NIR model at different wavelengths

The calibration transfer results obtained for corn spectra pretreated by the first derivative method and transferred by the SST algorithm are presented in Fig. 3. Wavelength selections were performed using the CAWS and reversed-CAWS methods at different threshold values. The CAWS threshold (T) was continuously increased from 0 in increments of 0.001, up to the point where the number of |Ri| values exceeding T on the Corr curve became less than or equal to the number of PLS factors (10). The reversed-CAWS threshold was continuously decreased from 1 in steps of 0.001, until the number of |Ri| values below T was reduced to ≤10. In Fig. 3, the profiles of RMSES and RMSEPtrans values of CAWS are represented by solid and dotted blue lines, respectively (RMSES1 and RMSEP1trans), whereas the corresponding profiles determined using the reversed-CAWS method are depicted in solid and dotted red lines, respectively (RMSES2 and RMSEP2trans). As shown in Fig. 3, the calibration transfer results processed by CAWS and reversed-CAWS exhibit different trends with changing thresholds. For CAWS treatment, the number of spectral variables reached its maximum (full spectrum) at a threshold of 0, where the RMSES1 and RMSEP1trans values were 0.051 and 0.078, respectively. Increasing thresholds resulted in reduced numbers of spectral variables, with RMSES1 fluctuating within a 0.06 interval. The lowest RMSES1 value
was attained at a threshold of 0.874 (RMSES1 = 0.047, RMSEP1trans = 0.070, group 3–10 in Table 1). In comparison, the reversed-CAWS treatment produced RMSES and RMSEPtrans values that generally decreased with increasing thresholds. The minimum number of spectral variables (10) appeared at the threshold of 0.644, with corresponding RMSES2 and RMSEP2trans values of 0.211 and 0.296, respectively. Beyond 0.644, the number of spectral variables increased with increasing thresholds, reaching the full spectrum and the lowest RMSES2 at T = 1. This indicates that, with the reversed-CAWS method, the best calibration transfer results are achieved using the full spectrum. Considering that both the RMSES and RMSEPtrans values of CAWS are lower than those of reversed-CAWS, irrespective of the threshold, the former is taken as the more suitable wavelength selection method.

3.3.2. Transfer results of the rice bran starch NIR model at different wavelengths

The calibration transfer results of rice bran spectra pretreated by the first derivative method and transferred by PDS are presented in Fig. 4. The CAWS and reversed-CAWS wavelength selection methods were applied to these spectra at different threshold values. To assess the influence of the threshold on the number of CAWS-selected wavelengths, threshold values were increased in increments of 0.001 from 0 up to X, where X is the first threshold at which the number of |Ri| > T was ≤9. Meanwhile, when using the reversed-CAWS method, threshold values were decreased from 1 in steps of 0.001, until the number of |Ri| < T became less than or equal to 9. The values of RMSES and RMSEPtrans were evaluated for both methods at every
Fig. 4 shows that CAWS treatment of the rice bran data yielded the maximum number of spectral variables (full spectrum) at a threshold of 0, with corresponding RMSES3 and RMSEP3trans values of 3.728 and 4.931, respectively. With increasing thresholds, the number of spectral variables decreased, and the values of RMSES3 first decreased and then increased. The lowest RMSES3 (2.936) was achieved at a threshold of 0.705, for which RMSEP3trans is 4.132 (group 2–10 in Table 2). The reversed-CAWS treatment resulted in generally decreasing profiles of RMSES and RMSEPtrans, with the minimum number of spectral variables (9) appearing at the threshold of 0.027. The values of RMSES4 and RMSEP4trans at this threshold are 13.031 and 20.287, respectively. Beyond 0.027, the number of spectral variables increased with increasing thresholds. The minimum value of RMSES4 (3.712) was attained at 0.993, with a corresponding RMSEP4trans value of 4.901. A comparison of the results depicted in Fig. 4 for the two wavelength selection methods shows that, although the optimization results of reversed-CAWS are slightly better than those of the full spectrum, CAWS is much more suitable for the analysis of the rice bran dataset.

3.4. Comparison of the results of the two datasets

According to the results presented in Tables 1 and 2, the CAWS-optimized models yield lower RMSES values than the full spectrum models when the same pretreatment method and calibration transfer algorithm are used. Similarly, the RMSES values of si-CAWS are lower than those of siPLS. In general, RMSES and RMSEPtrans follow the same trend, meaning that when one decreases, the other does too. However, owing to the influence of the standard and CAWS standard samples, several models that produced comparatively lower RMSES had relatively higher RMSEPtrans. This indicates that the selection of standard and CAWS standard samples needs to be further optimized.
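For reference, the two error measures compared in this section are both root mean square errors and differ only in the sample set over which they are evaluated. The expression below is a sketch in standard notation; the symbols n, y_j, and ŷ_j (number of samples, reference value, and predicted value) are introduced here for illustration and are not taken from the original text.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(\hat{y}_j - y_j\right)^{2}}
```

Under this reading, RMSES is obtained when the sum runs over the CAWS standard samples, whose transferred spectra are predicted by the model, and RMSEPtrans when it runs over the prediction set after calibration transfer.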
Nevertheless, a comparison of all investigated groups shows that the CAWS-optimized models consistently yielded the best transfer results, irrespective of the pretreatment, wavelength selection, and calibration transfer method used. The models of the control groups, on the other hand, presented worse transfer results. Specifically, for both datasets, the models with the best calibration performance did not transfer well. The CAWS-optimized models, however, had the best transfer results, although their calibration performance was not the best. This is because these models were specifically designed for the instrument being transferred. A comparison of the performances of the CAWS and siPLS wavelength selection methods (Fig. 2) shows that the former retains more spectral variables than the latter and is more effective in eliminating wavelengths with poor response consistency between instruments. Thus, CAWS is superior to siPLS. Similarly, compared to the reversed-CAWS method, CAWS yields better optimization results at different thresholds (Figs. 3 and 4), which signifies that it selects more suitable wavelengths for calibration transfer.

4. Conclusion
Fig. 2. Wavelengths selected by siPLS and CAWS based on the analysis of the corn dataset under conditions of PDS with (a) no, (b) SNV, or (c) first derivative pretreatment. Note: The shaded part in the figure represents the segments selected by siPLS. The solid line is the Corr curve, and the dotted line is the optimal CAWS threshold. The CAWS selected wavelengths are those at which the |Ri| values are greater than or equal to the threshold.
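As a minimal sketch of the selection rule stated in this caption, the Python snippet below computes a Corr curve as the absolute Pearson correlation between master and slave spectra of the same standard samples at each wavelength, and then keeps the wavelengths whose |Ri| reaches the threshold. The function names `corr_curve` and `caws_select`, the (samples × wavelengths) array layout, the handling of zero-variance wavelengths, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def corr_curve(master, slave):
    """Return |R_i| for every wavelength i.

    master, slave: arrays of shape (n_samples, n_wavelengths) holding the
    spectra of the same standard samples measured on the two instruments.
    """
    master = np.asarray(master, dtype=float)
    slave = np.asarray(slave, dtype=float)
    m = master - master.mean(axis=0)
    s = slave - slave.mean(axis=0)
    num = (m * s).sum(axis=0)
    den = np.sqrt((m ** 2).sum(axis=0) * (s ** 2).sum(axis=0))
    # A wavelength with zero variance on either instrument has no defined
    # correlation; this sketch assigns it |R_i| = 0 (an assumption).
    r = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return np.abs(r)


def caws_select(abs_r, threshold):
    """Boolean mask of wavelengths with |R_i| >= threshold."""
    return np.asarray(abs_r) >= threshold


# Toy example: 4 standard samples, 4 wavelengths (made-up numbers).
master = np.array([[1.0, 1.0, 1.0, 1.0],
                   [2.0, 2.0, 2.0, 2.0],
                   [3.0, 3.0, 3.0, 3.0],
                   [4.0, 4.0, 4.0, 4.0]])
slave = np.array([[3.0, -1.0, 5.0, 1.0],
                  [5.0, -2.0, 5.0, -1.0],
                  [7.0, -3.0, 5.0, -1.0],
                  [9.0, -4.0, 5.0, 1.0]])
abs_r = corr_curve(master, slave)
mask = caws_select(abs_r, threshold=0.9)
print(abs_r, mask)
```

In the toy data, the first two wavelengths are perfectly (anti-)correlated between the instruments and are kept, whereas the constant and uncorrelated wavelengths are dropped. A threshold of 0 keeps the full spectrum, in line with the behaviour reported for CAWS at T = 0.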
threshold. In Fig. 4, the solid and dotted blue lines represent the CAWS-determined profiles of RMSES3 and RMSEP3trans, respectively, whereas the RMSES4 and RMSEP4trans profiles determined by reversed-CAWS are depicted in solid and dotted red lines, respectively.
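The threshold scan described for Figs. 3 and 4 can be sketched in Python as follows. The sketch assumes a precomputed array of |Ri| values; CAWS keeps wavelengths with |Ri| ≥ T while T rises from 0, and reversed-CAWS is read here as keeping wavelengths with |Ri| ≤ T while T falls from 1, which is an interpretation rather than a definition quoted from the paper. The names `scan_thresholds` and `best_threshold` are hypothetical, and `rmses_fn` is a hypothetical user-supplied callable standing in for the modelling, transfer, and prediction of the CAWS standard samples.

```python
import numpy as np


def scan_thresholds(abs_r, n_factors, step=0.001, reverse=False):
    """Yield (T, mask) along the threshold grid.

    CAWS (reverse=False): T rises from 0, wavelengths with |R_i| >= T kept.
    Reversed-CAWS (reverse=True): T falls from 1, wavelengths with
    |R_i| <= T kept (this reading of "reversed" is an assumption).
    The scan stops at the first T that leaves no more than n_factors
    wavelengths; that T is still yielded.
    """
    abs_r = np.asarray(abs_r, dtype=float)
    n_steps = int(round(1.0 / step))
    grid = range(n_steps, -1, -1) if reverse else range(n_steps + 1)
    for k in grid:
        t = round(k * step, 10)  # avoid drift such as 0.7050000000000001
        mask = abs_r <= t if reverse else abs_r >= t
        yield t, mask
        if mask.sum() <= n_factors:
            break


def best_threshold(abs_r, n_factors, rmses_fn, **scan_kwargs):
    """Return (T, mask, error) with the smallest error over the scan.

    rmses_fn(mask) is expected to return the RMSES of the model built on
    the selected wavelengths; it is supplied by the caller.
    """
    best = None
    for t, mask in scan_thresholds(abs_r, n_factors, **scan_kwargs):
        err = rmses_fn(mask)
        if best is None or err < best[2]:
            best = (t, mask, err)
    return best


def toy_error(mask):
    """Stand-in for RMSES (made up): smallest when three wavelengths remain."""
    return abs(int(mask.sum()) - 3)


# Toy Corr curve with five wavelengths (made-up numbers).
abs_r = np.array([0.2005, 0.5005, 0.8005, 0.9005, 0.9505])
caws_grid = [t for t, _ in scan_thresholds(abs_r, n_factors=2)]
rev_grid = [t for t, _ in scan_thresholds(abs_r, n_factors=2, reverse=True)]
t_opt, mask_opt, err_opt = best_threshold(abs_r, 2, toy_error)
print(caws_grid[-1], rev_grid[-1], t_opt, mask_opt)
```

Passing the error computation in as a callable keeps the scan independent of the pretreatment and transfer algorithm (SBC, PDS, or SST), which matches how the same scan is reported for both datasets.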
A major problem encountered in the field of NIR instrument cross-calibration is that the optimal calibration model does not necessarily produce good transfer results, because it is not adjusted for the instrument being transferred. This necessitates the use of methods capable of selecting the wavelengths at which the responses of the master and slave instruments are highly similar. In addition, the selected wavelengths should include the spectral information of the target constituent. In this study, we propose a new wavelength selection method, called CAWS, that is highly efficient in identifying the wavelengths showing high correlation between the standard spectra of master and slave instruments, for the purpose of calibration transfer. To validate this method, it was applied to two available spectral datasets, corn and rice bran. The data were subjected to different pretreatment methods and analyzed by various calibration transfer
Fig. 3. Calibration transfer results of the corn dataset determined using the CAWS and reversed-CAWS wavelength selection methods, at different thresholds. Note: the corn dataset was pretreated by the first derivative method and transferred by SST. Abbreviations: RMSES1: root mean square error of CAWS standard samples after CAWS treatment; RMSEP1trans: root mean square error of prediction after CAWS treatment; RMSES2: root mean square error of CAWS standard samples after reversed-CAWS treatment; RMSEP2trans: root mean square error of prediction after reversed-CAWS treatment.
algorithms. In addition to CAWS, wavelengths were selected by other methods for comparison. The obtained results confirm the superiority of the calibration transfer results determined using the CAWS optimized models. The selection of wavelengths by CAWS occurs in two steps. First, the absolute values of Ri (coefficient of correlation between the responses of master and slave instruments) are calculated at every
wavelength (i), and the corresponding Corr curve is constructed. In the second step, the optimal threshold is determined by screening, and the wavelengths whose |Ri| values are greater than or equal to the optimal threshold are selected for subsequent modeling, transfer, and prediction. During screening, the threshold is systematically varied, resulting in the generation of models with different selected wavelengths. Each one of these models is then used to predict the transferred spectra of CAWS standard
Fig. 4. Calibration transfer results of the rice bran dataset determined using the CAWS and reversed-CAWS wavelength selection methods, at different thresholds. Note: the rice bran dataset was pretreated by the first derivative method and transferred by PDS. Abbreviations: RMSES3: root mean square error of CAWS standard samples after CAWS treatment; RMSEP3trans: root mean square error of the prediction after CAWS treatment; RMSES4: root mean square error of CAWS standard samples after reversed-CAWS treatment; RMSEP4trans: root mean square error of the prediction after reversed-CAWS treatment.
samples. The RMSES values are subsequently calculated, and the threshold yielding the lowest error is identified as the optimal threshold. In general, models with low RMSES are likely to show low RMSEPtrans, just as models with low RMSECV show low RMSEP (root mean square error of prediction).

An important advantage of the proposed CAWS method is that it may be used in combination with other wavelength selection methods. In this study, we assess the performance of the si-CAWS method, which combines CAWS and siPLS. The results show that for similar pretreatment and calibration transfer conditions, si-CAWS is mostly superior to siPLS; however, it is inferior to CAWS. This indicates that the combined method is not particularly successful in selecting wavelengths that show both high response consistency between instruments and sufficient spectral information of the target constituent. Also, the optimization effect of si-CAWS is limited by the relatively small number of spectral variables. Nevertheless, alternative combinations of CAWS with other methods may lead to improved wavelength selection efficiency and thus should be investigated further.

In summary, to select the appropriate wavelengths, CAWS integrates the modeling and calibration transfer in one step. Instead of predetermining the wavelengths in the modeling step, the CAWS method developed herein adjusts the selected wavelengths according to the instruments being transferred, resulting in improved calibration transfer.

CRediT authorship contribution statement

Zhuopin Xu: Conceptualization, Methodology, Software, Formal analysis, Investigation, Validation, Writing - original draft, Writing - review & editing. Shuang Fan: Conceptualization, Methodology, Formal analysis, Investigation, Validation. Weimin Cheng: Conceptualization, Methodology. Jie Liu: Conceptualization, Methodology. Pengfei Zhang: Conceptualization, Methodology, Software. Yang Yang: Writing - review & editing. Cong Xu: Conceptualization.
Binmei Liu: Funding acquisition, Resources. Jing Liu: Funding acquisition. Qi Wang: Supervision, Project administration, Funding acquisition. Yuejin Wu: Supervision, Project administration, Funding acquisition.

Acknowledgement

This work was financially supported by the Anhui Provincial Key Research and Development Program (Grant No. 201904c03020007), the Science and Technology Service program of Chinese Academy of Sciences (Grant No. KFJ-STS-ZDTP-054), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA08040107), and the Anhui Science and Technology Major Project (Grant No. 18030701205).

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] C. Pasquini, Near infrared spectroscopy: a mature analytical technique with new perspectives - a review, Anal. Chim. Acta 1026 (2018) 8–36.
[2] S. Grassi, C. Alamprese, Advances in NIR spectroscopy applied to process analytical technology in food industries, Curr. Opin. Food Sci. 22 (2018) 17–21.
[3] R. Deidda, P.Y. Sacre, M. Clavaud, L. Coic, H. Avohou, P. Hubert, E. Ziemons, Vibrational spectroscopy in analysis of pharmaceuticals: critical review of innovative portable and handheld NIR and Raman spectrophotometers, Trac-Trend Anal Chem 114 (2019) 251–259.
[4] A.M. Herrero, Raman spectroscopy a promising technique for quality assessment of meat and fish: a review, Food Chem. 107 (2008) 1642–1651.
[5] B.M. Wise, R.T. Roginski, A calibration model maintenance roadmap, Ifac Papersonline 48 (2015) 260–265.
[6] R.N. Feudale, N.A. Woody, H.W. Tan, A.J. Myles, S.D. Brown, J. Ferre, Transfer of multivariate calibration models: a review, Chemometr Intell Lab 64 (2002) 181–192.
[7] T. Fearn, Standardisation and calibration transfer for near infrared instruments: a review, J near Infrared Spec 9 (2001) 229–244.
[8] J.J. Workman, A review of calibration transfer practices and instrument differences in spectroscopy, Appl. Spectrosc. 72 (2018) 340–365.
[9] Y.Y. Shi, J.Y. Li, X.L. Chu, Progress and applications of multivariate calibration model transfer methods, Chinese J Anal Chem 47 (2019) 479–487.
[10] D. Brouckaert, J.S. Uyttersprot, W. Broeckx, T. De Beer, Calibration transfer of a Raman spectroscopic quantification method for the assessment of liquid detergent compositions from at-line laboratory to in-line industrial scale, Talanta 179 (2018) 386–392.
[11] V. Panchuk, D. Kirsanov, E. Oleneva, V. Semenov, A. Legin, Calibration transfer between different analytical methods, Talanta 170 (2017) 457–463.
[12] J. Eliaerts, N. Meert, P. Dardenne, F. Van Durme, V. Baeten, N. Samyn, K. De Wael, Evaluation of a calibration transfer between a bench top and portable Mid-InfraRed spectrometer for cocaine classification and quantification, Talanta (2019) 120481.
[13] Y.Y. Pu, D.W. Sun, C. Riccioli, M. Buccheri, M. Grassi, T.M.P. Cattaneo, A. Gowen, Calibration transfer from micro NIR spectrometer to hyperspectral imaging: a case study on predicting soluble solids content of Bananito fruit (Musa acuminata), Food Anal Method 11 (2018) 1021–1033.
[14] E. Bouveresse, A.C. Hartmann, D.L. Massart, I.R. Last, K.A. Prebble, Standardization of near-infrared spectrometric instruments, Anal. Chem. 68 (1996) 982–990.
[15] Y.D. Wang, D.J. Veltkamp, B.R. Kowalski, Multivariate instrument standardization, Anal. Chem. 63 (1991) 2750–2756.
[16] Y.D. Wang, M.J. Lysaght, B.R. Kowalski, Improvement of multivariate calibration through instrument standardization, Anal. Chem. 64 (1992) 562–564.
[17] W. Du, Z.P. Chen, L.J. Zhong, S.X. Wang, R.Q. Yu, A. Nordon, D. Littlejohn, M. Holden, Maintaining the predictive abilities of multivariate calibration models by spectral space transformation, Anal. Chim. Acta 690 (2011) 64–70.
[18] Y. Liu, W.S. Cai, X.G. Shao, Standardization of near infrared spectra measured on multi-instrument, Anal. Chim. Acta 836 (2014) 18–23.
[19] W. Fan, Y.Z. Liang, D.L. Yuan, J.J. Wang, Calibration model transfer for near-infrared spectra based on canonical correlation analysis, Anal. Chim. Acta 623 (2008) 22–29.
[20] J.T. Peng, S.L. Peng, A. Jiang, J. Tan, Near-infrared calibration transfer based on spectral regression, Spectrochim. Acta A 78 (2011) 1315–1320.
[21] L.L. Zhao, J.H. Li, W.J. Zhang, J.C. Wang, L.D. Zhang, Calibration transfer between two FTNIR spectrophotometers using SVR, Spectrosc Spect Anal 28 (2008) 2299–2303.
[22] W.R. Chen, J. Bin, H.M. Lu, Z.M. Zhang, Y.Z. Liang, Calibration transfer via an extreme learning machine auto-encoder, Analyst 141 (2016) 1973–1980.
[23] A. Folch-Fortuny, R. Vitale, O.E. Denoord, A. Ferrer, Calibration transfer between NIR spectrometers: new proposals and a comparative study, J. Chemom. 31 (2017) e2874.
[24] C. Liang, H.F. Yuan, Z. Zhao, C.F. Song, J.J. Wang, A new multivariate calibration model transfer method of near-infrared spectral analysis, Chemom. Intell. Lab. Syst. 153 (2016) 51–57.
[25] Z. Sun, J. Wang, L. Nie, L. Li, D. Cao, J. Fan, H. Wang, R. Liu, Y. Zhang, H. Zang, Calibration transfer of near infrared spectrometers for the assessment of plasma ethanol precipitation process, Chemometr Intell Lab 181 (2018) 64–71.
[26] E. Bouveresse, D.L. Massart, P. Dardenne, Calibration transfer across near-infrared spectrometric instruments using Shenk’s algorithm: effects of different standardisation samples, Anal. Chim. Acta 297 (1994) 405–416.
[27] Y. Liu, W.S. Cai, X.G. Shao, Linear model correction: a method for transferring a near-infrared multivariate calibration model without standard samples, Spectrochim. Acta A Mol. Biomol. Spectrosc. 169 (2016) 197–201.
[28] K.E. Kramer, R.E. Morris, S.L. Rose-Pehrsson, Comparison of two multiplicative signal correction strategies for calibration transfer without standards, Chemometr Intell Lab 92 (2008) 33–43.
[29] B. Malli, A. Birlutiu, T. Natschläger, Standard-free calibration transfer - an evaluation of different techniques, Chemometr Intell Lab 161 (2017) 49–60.
[30] Y.H. Yun, H.D. Li, B.C. Deng, D.S. Cao, An overview of variable selection methods in multivariate analysis of near-infrared spectra, Trac-Trend Anal Chem 113 (2019) 102–115.
[31] R.M. Balabin, S.V. Smirnov, Variable selection in near-infrared spectroscopy: benchmarking of feature selection methods on biodiesel data, Anal. Chim. Acta 692 (2011) 63–72.
[32] T. Mehmood, K.H. Liland, L. Snipen, S. Saebo, A review of variable selection methods in partial least squares regression, Chemometr Intell Lab 118 (2012) 62–69.
[33] H. Swierenga, P.J. de Groot, A.P. de Weijer, M.W.J. Derksen, L.M.C. Buydens, Improvement of PLS model transferability by robust wavelength selection, Chemometr Intell Lab 41 (1998) 237–248.
[34] X.Y. Zhang, Q.B. Li, G.J. Zhang, Calibration transfer without standards for spectral analysis based on stability competitive adaptive reweighted sampling, Spectrosc. Spectr. Anal. 5 (2014) 1429–1433.
[35] L. Ni, M. Han, S. Luan, L. Zhang, Screening wavelengths with consistent and stable signals to realize calibration model transfer of near infrared spectra, Spectrochim. Acta A Mol. Biomol. Spectrosc. 206 (2019) 350–358.
[36] L. Zhang, Y. Li, W. Huang, L. Ni, J. Ge, The method of calibration model transfer by optimizing wavelength combinations based on consistent and stable spectral signals, Spectrochim. Acta A Mol. Biomol. Spectrosc. 227 (2020) 117647.
[37] Y. Mu, X. Liu, L. Wang, A Pearson’s correlation coefficient based decision tree and its parallel implementation, Inf. Sci. 435 (2018) 40–58.
[38] P.S. Sampaio, A. Soares, A. Castanho, A.S. Almeida, J. Oliveira, C. Brites, Optimization of rice amylose determination by NIR-spectroscopy using PLS chemometrics algorithms, Food Chem. 242 (2018) 196–204.
[39] Chinese National Standard GB 5006-1985, Determination of Crude Starch in Cereals Seeds (in Chinese), China Plan Publishing Company, Beijing, 1985 194–196.
[40] K. Khodabux, M. Sophia, S. L’Omelette, S. Jhaumeer-Laulloo, P. Ramasami, P. Rondeau, Chemical and near-infrared determination of moisture, fat and protein in tuna fishes, Food Chem. 102 (2007) 669–675.
[41] R.K.H. Galvao, M.C.U. Araujo, G.E. Jose, M.J.C. Pontes, E.C. Silva, T.C.B. Saldanha, A method for calibration and validation subset partitioning, Talanta 67 (2005) 736–740.
[42] Z. Xu, S. Fan, J. Liu, B. Liu, L. Tao, J. Wu, S. Hu, L. Zhao, Q. Wang, Y. Wu, A calibration transfer optimized single kernel near-infrared spectroscopic method, Spectrochim. Acta A Mol. Biomol. Spectrosc. 220 (2019) 117098.
[43] W.F. McClure, D.L. Stanfield, Near-infrared spectroscopy of biomaterials, Handbook of Vibrational Spectroscopy, John Wiley & Sons Ltd, UK 2006, pp. 212–228.