Journal Pre-proof Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method
Zhenzhen Xia, Tian Yi, Yan Liu
PII: S1386-1425(19)31167-9
DOI: https://doi.org/10.1016/j.saa.2019.117777
Reference: SAA 117777
To appear in: Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy
Received date: 14 August 2019
Revised date: 6 November 2019
Accepted date: 6 November 2019
Please cite this article as: Z. Xia, T. Yi and Y. Liu, Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy (2019), https://doi.org/10.1016/j.saa.2019.117777
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Published by Elsevier.
Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method
Zhenzhen Xia1, Tian Yi1, Yan Liu2,3,4
1 Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science, Wuhan, 430064, China
2 College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, China
3 Key Laboratory for Deep Processing of Major Grain and Oil (Wuhan Polytechnic University), Ministry of Education, College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, P.R. China
4 Hubei Key Laboratory for Processing and Transformation of Agricultural Products (Wuhan Polytechnic University), College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, P.R. China
Corresponding author.
Corresponding address: College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan, 430023, P. R. China
Tel: +86-27-83924790
Fax: +86-27-83924790
E-mail: [email protected]
Abstract
Sesame is one of the most important crops in Africa and East Asia. The sesamin and sesamolin in sesame have shown various pharmacological, biological and physiological activities. In this study, a rapid and nondestructive method for the determination of sesamin and sesamolin in Chinese sesame by near-infrared spectroscopy coupled with chemometric methods is proposed. The near-infrared spectra of sesame samples from three different Chinese areas were collected, and partial least squares (PLS) was used to construct the quantitative models. Spectral preprocessing and variable selection methods were adopted to improve the predictability and stability of the models. Reasonable quantitative results were obtained when the samples used for model construction and prediction were harvested in the same year. For sesamin and sesamolin, the correlation coefficients (R) were 0.9754 and 0.9636, and the root mean square errors of prediction (RMSEP) were 151.2951 and 39.7720, respectively. The optimized models were less effective when used to predict samples harvested in other years or countries; however, acceptable results could still be obtained.
Key words: near-infrared spectroscopy; sesame; sesamin; sesamolin; partial least squares.
1. Introduction
Sesame (Sesamum indicum L.) has been widely consumed as a traditional food and oil crop in Africa and East Asia for thousands of years [1]. Sesame contains various active constituents, of which sesamin and sesamolin are the two typical and major lignans and play significant roles in seed physiology. First discovered and isolated in the 1950s, the two components have shown strong biological, pharmacological and physiological activities in various studies [2, 3]. For instance, Suja et al. reported the free radical scavenging behavior of sesamin and sesamolin in the DPPH system [4]. Yokota et al. found that lignans in sesame down-regulate cyclin D1 protein expression in human tumor cells [5]. The neuroprotective effects of sesamin and sesamolin on gerbil brain in cerebral ischemia were reported by Cheng et al. [6]. Other activities, such as increasing hepatic fatty acid oxidation enzymes [7], enhancing the antioxidant activity of vitamin E in lipid peroxidation systems [8], lowering cholesterol levels [9] and antihypertensive effects [10, 11], were also reported.
Because of their potential in health promotion, the two lignans have been used as active ingredients in antioxidants, antiseptics, bactericides, viricides, disinfectants, moth repellants and anti-tubercular agents [12]. Therefore, it is important to determine sesamin and sesamolin in sesame seeds. Chromatographic methods are the most widely used for the determination of sesamin and sesamolin and have proved accurate and stable in practical applications. Various chromatographic methods, including thin-layer chromatography (TLC) [13, 14], gas chromatography (GC) [15], normal-phase liquid chromatography (NPLC) [16], reversed-phase liquid chromatography (RPLC) [17, 18] and high performance liquid chromatography (HPLC) [19-21], have been adopted to analyze the two lignans in sesame all over the world. Chromatographic methods, however, suffer from sophisticated pretreatment, time-consuming procedures and environmentally hazardous reagents. These shortcomings restrict their large-scale application in rapid determination. Thus, to satisfy the need for rapid determination, it is necessary to develop a simple, time-saving and non-destructive analytical method for sesamin and sesamolin.
As a non-destructive, rapid, eco-friendly and low-cost analytical technique, near-infrared spectroscopy has been applied on a large scale in various fields [22-24]. In food analysis, a large number of applications, including meat [25], vegetables [26], fruit [27, 28] and fish [29, 30], have been reported. Generally, near-infrared spectra have weak absorbance and broad peaks. Therefore, a reliable calibration model has to be used to extract quantitative or qualitative information from the corresponding spectrum. Support vector regression (SVR), principal component regression (PCR), partial least squares (PLS) and artificial neural networks (ANN) are the most commonly used calibration methods. In some situations, multiplicative scattering, background shift or spectral redundancy may reduce the accuracy and stability of the model. To optimize the calibration models, several chemometric methods, including spectral pretreatment [31-33], outlier detection [34, 35], variable selection [36-42] and ensemble modeling [43-45], can be used.
The possibility of determining sesamin and sesamolin by this technique has been investigated before [46]. Improvements, however, are still needed. Thus, the possibility of accurate, rapid and non-destructive determination of sesamin and sesamolin in Chinese sesame using near-infrared spectroscopy coupled with chemometric methods was investigated in this study. Partial least squares (PLS) was used to construct the near-infrared quantitative analysis models. To optimize the PLS models, 1st derivative, continuous wavelet transform (CWT), multiplicative scatter correction (MSC) and standard normal variate (SNV) were used to preprocess the spectra. Moreover, three wavelength selection methods, Monte Carlo uninformative variable elimination (MC-UVE) [39], the randomization test (RT) [40] and competitive adaptive reweighted sampling (CARS) [38], were adopted to select the useful variables and further improve the predictability and stability of the models. To evaluate the proposed method, the concentrations of sesamin and sesamolin obtained by the method and by HPLC analysis were compared.
2. Experimental and calculations
2.1 Reagents and samples
Sesamin (purity > 98%) and sesamolin (purity > 95%) were purchased from Aladdin (China). Methyl alcohol was obtained from Tianjin Chemical Reagent Corporation (China). A total of 743 sesame seed samples were collected. These samples belong to four different datasets. The first three datasets contain samples harvested in 2018, 2017 and 2016, respectively. All the samples in these three datasets come from three Chinese areas (the middle reaches of the Changjiang River, Northeast China and the Huang-Huai Valleys). The middle reaches of the Changjiang River include Hubei, Hunan, Jiangxi and the southern part of Anhui province. Northeast China includes Heilongjiang, Jilin and Liaoning provinces, and the Huang-Huai Valleys include Henan, the southern part of Shandong and the northern part of Anhui province. Additionally, dataset four contains 151 samples collected from East Africa (Ethiopia and Sudan). Table 1 shows the detailed information of the samples in each dataset.
Table 1
2.2 HPLC analysis
The reference values of sesamin and sesamolin in the sesame samples were determined by HPLC. The extraction procedure of Ref. [20] was adopted to extract the two lignans from the samples. The extracts were then analyzed by HPLC according to Ref. [21]. The analysis results are also listed in Table 1.
2.3 Spectral measurement
A NIR spectrometer in diffuse reflection mode (Antaris Ⅱ, Thermo Fisher, USA) was used to collect the spectra. Each spectrum ranges from 3999.6 cm-1 to 9999.1 cm-1 and consists of 3112 data points with a digitization interval of ca. 1.93 cm-1. The scan number was 64 and the measurements were performed at room temperature.
2.4 Calculations
The 193 samples in dataset 1 were randomly divided into a calibration set and a prediction set. The calibration set contains 150 samples, and the corresponding spectra were processed by four spectral preprocessing methods (1st derivative, CWT, MSC and SNV). The raw and preprocessed spectra were then used to construct separate PLS models. The model with the best cross-validation results was further optimized by the MC-UVE, RT and CARS methods. The prediction set, which contains 43 samples, was used to evaluate the optimized models. The concentrations of sesamin and sesamolin in the prediction set samples were calculated by the optimized models and then compared with the results obtained by the HPLC method. Moreover, to cover sesame seeds harvested in different years or from different countries, the samples in datasets 2, 3 and 4 were also used to evaluate the optimized models. Four parameters, the correlation coefficient (R), root mean squared error of cross validation (RMSECV), root mean squared error of prediction (RMSEP) and ratio of prediction to deviation (RPD), were adopted as criteria for the evaluation of the models.
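The evaluation parameters named above can be illustrated with a minimal sketch. The text does not give explicit formulas, so the definitions below are assumptions based on common chemometric usage: R is taken as the Pearson correlation between reference and predicted values, the root mean squared error is computed over all samples of the set (giving RMSECV or RMSEP depending on which set is evaluated), and RPD is taken as the sample standard deviation of the reference values divided by that error. The function names and the numbers in the example are hypothetical.

```python
import math


def pearson_r(y_ref, y_pred):
    """Correlation coefficient R between reference and predicted values."""
    n = len(y_ref)
    mean_ref = sum(y_ref) / n
    mean_pred = sum(y_pred) / n
    cov = sum((a - mean_ref) * (b - mean_pred) for a, b in zip(y_ref, y_pred))
    var_ref = sum((a - mean_ref) ** 2 for a in y_ref)
    var_pred = sum((b - mean_pred) ** 2 for b in y_pred)
    return cov / math.sqrt(var_ref * var_pred)


def rmse(y_ref, y_pred):
    """Root mean squared error (RMSECV or RMSEP, depending on the sample set)."""
    n = len(y_ref)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_ref, y_pred)) / n)


def rpd(y_ref, y_pred):
    """Assumed definition: sample SD of the reference values divided by the RMSE."""
    n = len(y_ref)
    mean_ref = sum(y_ref) / n
    sd = math.sqrt(sum((a - mean_ref) ** 2 for a in y_ref) / (n - 1))
    return sd / rmse(y_ref, y_pred)


# Hypothetical reference (HPLC) and predicted (NIR) contents in mg/kg.
hplc = [500.0, 1000.0, 1500.0, 2000.0]
nir = [550.0, 950.0, 1550.0, 1950.0]
print(pearson_r(hplc, nir), rmse(hplc, nir), rpd(hplc, nir))
```

Under these assumed definitions, a larger R and RPD and a smaller error indicate a better model, which is how the values in Tables 2 to 4 are read in the following sections.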
3. Results and discussion
3.1 Determination of latent variable
The raw spectra and the spectra preprocessed by 1st derivative, CWT, MSC and SNV were used to construct PLS models of sesamin and sesamolin, respectively. The number of latent variables is the key parameter in the construction of a partial least squares model. In this study, Monte Carlo cross validation (MCCV) [47] was adopted to determine the number of latent variables for the different PLS models. This method determines the number of latent variables by calculating the predicted residual sum of squares (PRESS) of a PLS model. When an appropriate number of latent variables is chosen, the PRESS should be small enough. For the PLS models of sesamin and sesamolin, the variations of PRESS with increasing number of latent variables are plotted in Fig. 1(a) and (b), respectively.
For the models of sesamin, it can be seen from Fig. 1(a) that the PRESS curves follow similar trends. The values are large when the number of latent variables is small and then decrease as the number of latent variables increases. Finally, the values fluctuate within a narrow range, which differs slightly between models. For instance, the PRESS of the model constructed from the raw spectra (blue line) decreases from 4.6×10^7 to 1.6×10^7 when the number of latent variables increases from one to seven. The value then fluctuates slightly around 1.4×10^7 when the number of latent variables is larger than 10. Therefore, the numbers of latent variables for the PLS models of the raw spectra and the spectra preprocessed with 1st derivative, CWT, MSC and SNV were set to 12, 12, 13, 14 and 14, respectively. For the models of sesamolin, similar behavior was observed in Fig. 1(b). Thus, the numbers of latent variables for the five PLS models are 14, 15, 15, 16 and 16.
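The PRESS-based choice of the number of latent variables can be sketched as follows. This is an illustrative pure-Python implementation under stated assumptions, not the code used in this study: PLS1 is fitted with the NIPALS algorithm, the hold-out fraction, number of Monte Carlo splits and random seed are arbitrary, the data are synthetic, and the rule of taking the minimum PRESS is a simplification of reading the plateau from Fig. 1. All function names are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_lv):
    """Fit a PLS1 model by NIPALS; returns the means and one (w, p, q) per latent variable."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xr = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yr = [v - y_mean for v in y]
    comps = []
    for _ in range(n_lv):
        w = [sum(Xr[i][j] * yr[i] for i in range(n)) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm == 0.0:  # no covariance left between X and y
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xr]  # scores
        tt = dot(t, t)
        p = [sum(Xr[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = dot(yr, t) / tt
        # Deflate X and y before extracting the next latent variable.
        Xr = [[Xr[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        yr = [yr[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps on the centered spectrum."""
    x_mean, y_mean, comps = model
    xr = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = dot(xr, w)
        y_hat += q * t
        xr = [a - t * b for a, b in zip(xr, p)]
    return y_hat


def mccv_press(X, y, max_lv, n_splits=50, holdout=0.3, seed=0):
    """Accumulate PRESS over random calibration/validation splits for 1..max_lv latent variables."""
    rng = random.Random(seed)
    n = len(X)
    n_val = max(1, int(round(holdout * n)))
    press = {a: 0.0 for a in range(1, max_lv + 1)}
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        val, cal = idx[:n_val], idx[n_val:]
        for a in press:
            model = pls1_fit([X[i] for i in cal], [y[i] for i in cal], a)
            for i in val:
                press[a] += (y[i] - pls1_predict(model, X[i])) ** 2
    return press


# Synthetic, noise-free example: y depends on all three variables.
rng = random.Random(1)
X = [[rng.gauss(0, 3), rng.gauss(0, 1), rng.gauss(0, 0.3)] for _ in range(30)]
y = [row[0] + 2 * row[1] + 5 * row[2] for row in X]
press = mccv_press(X, y, max_lv=3, n_splits=20)
best_lv = min(press, key=press.get)
print(press, best_lv)
```

With real spectra, PRESS typically levels off rather than reaching a sharp minimum, so the number of latent variables would be read from the onset of the plateau, as done with Fig. 1.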
Figure 1
3.2 Effect of spectral pretreatment
Previous studies have shown the effects of spectral pretreatment on PLS models [22-24, 31-33]. Therefore, the raw spectra were preprocessed by 1st derivative, CWT, MSC and SNV, and then used to construct the PLS models. Cross validation was adopted to evaluate the models, and the evaluation results are shown in Table 2. For comparison, the evaluation results of the model constructed from the raw spectra were also calculated. Clearly, the models constructed from the preprocessed spectra show higher predictability and accuracy. For sesamin, the RMSECV, R (calibration set) and RPD of the model constructed from the raw spectra were 310.5864, 0.8750 and 2.2991. For the models constructed from the spectra preprocessed with 1st derivative and CWT, the RMSECV decreases to 216.8498 and 211.9511, while the R (calibration set) increases to 0.9592 and 0.9625 and the RPD increases to 3.4213 and 3.5485, respectively. MSC and SNV also improve the predictability and accuracy of the PLS models, with smaller RMSECVs and higher Rs (calibration set) and RPDs, but the improvements are smaller than those of 1st derivative and CWT. In a detailed comparison, CWT is slightly better than 1st derivative. For sesamolin, the improvements in predictability and accuracy from spectral preprocessing can also be observed, and CWT gives the best results. Thus, for both sesamin and sesamolin, CWT was used to preprocess the spectra of the sesame seeds.
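Three of the four pretreatments compared above can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions rather than the implementation used in this study: SNV uses the sample standard deviation, MSC regresses each spectrum on the mean spectrum of the set, and the 1st derivative is approximated by a plain finite difference (the exact derivative filter is not specified in the text). CWT is omitted because its wavelet and scale are not given. The function names and the toy spectra are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale each spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_pts = len(spectra[0])
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_pts)]
    ref_mean = sum(ref) / n_pts
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_pts
        # Least-squares fit s = intercept + slope * ref, then undo both terms.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return ref, corrected


def first_derivative(spectrum, step=1.0):
    """Finite-difference approximation of the 1st derivative (one point shorter)."""
    return [(b - a) / step for a, b in zip(spectrum, spectrum[1:])]


# Toy spectra: the same band shape with different scatter (scale) and offset.
base = [0.10, 0.30, 0.20, 0.50, 0.40]
spectra = [[a * v + b for v in base] for a, b in [(1.0, 0.0), (1.2, 0.05), (0.8, -0.02)]]
ref, corrected = msc(spectra)
print(corrected[1])
print(snv(spectra[1]))
print(first_derivative(spectra[1]))
```

In this toy case the three spectra differ only by scale and offset, so MSC maps each of them onto the mean spectrum, which is the behavior the correction is designed to have.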
Table 2
3.3 Effect of variable selection
To further optimize the PLS models, the MC-UVE, RT and CARS methods were used to select the useful variables from the spectra preprocessed by CWT. Fig. 2(a) and (b) plot the selected variables for sesamin and sesamolin, respectively. The mean spectrum preprocessed by CWT is also plotted for comparison. For sesamin, the selected variables located in 4000-5200 cm-1 are attributed to the combination of CC stretching vibrations and CH stretching vibrations [48]. The selected variables around 6000 cm-1 are assigned to the first overtone of CH stretching vibrations. The selected variables around 8000-9000 cm-1 and 10000 cm-1 generally represent the second and third overtones of CH stretching vibrations, respectively [49]. The distributions of the variables selected by the three methods are similar. In a detailed comparison, the results of the MC-UVE and RT methods are more similar to each other, and the CARS method selected fewer variables than the other two methods. For sesamolin, it can be seen from Fig. 2(b) that the distributions of the selected variables are similar to those plotted in Fig. 2(a), which may be explained by the close structures of the two compounds. However, the selected variables in each region differ slightly because of the small differences in the structures. For instance, in the wavenumber region of the combination of CC stretching vibrations and CH stretching vibrations, all three wavelength selection methods selected variables for both sesamin and sesamolin, but the selected variables in Fig. 2(a) and (b) are different.
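The selection criterion behind MC-UVE can be illustrated with a short sketch. It assumes that the regression coefficients of many PLS sub-models, each built on a random subset of the calibration samples, are already available; the stability of a variable is then taken as the mean of its coefficient divided by its standard deviation over the sub-models, and variables with the largest absolute stability are retained. The function names, the coefficient values and the number of variables kept are hypothetical and chosen only for illustration.

```python
import statistics


def mc_uve_stability(coef_runs):
    """Stability of each variable: mean / SD of its coefficient over the Monte Carlo sub-models."""
    n_vars = len(coef_runs[0])
    stability = []
    for j in range(n_vars):
        column = [run[j] for run in coef_runs]
        stability.append(statistics.mean(column) / statistics.stdev(column))
    return stability


def select_variables(stability, n_keep):
    """Indices of the n_keep variables with the largest absolute stability, in spectral order."""
    ranked = sorted(range(len(stability)), key=lambda j: abs(stability[j]), reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical coefficients of three variables from four sub-models.
coef_runs = [
    [0.50, 0.02, -0.31],
    [0.52, -0.03, -0.29],
    [0.49, 0.01, -0.30],
    [0.51, -0.02, -0.32],
]
stability = mc_uve_stability(coef_runs)
selected = select_variables(stability, n_keep=2)
print(stability, selected)
```

In this example the second variable has a coefficient that fluctuates around zero, so its stability is low and it is treated as uninformative, while the first and third variables are kept.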
Figure 2
Table 3 lists the numbers of variables selected by the MC-UVE, RT and CARS methods. The RMSECV, R (calibration set) and RPD of the PLS models optimized with CWT coupled with the three variable selection methods are also shown in the table. The efficiency of the variable selection methods can be clearly seen by comparing the three parameters in Tables 2 and 3. Taking sesamin as an example, the RMSECV, R (calibration set) and RPD of the PLS model constructed from the spectra preprocessed with CWT in Table 2 are 211.9511, 0.9625 and 3.5485, respectively. When 339 variables were selected by the MC-UVE method, the three parameters change to 141.5915, 0.9728 and 4.0697. Similarly, the three parameters change to 133.9682, 0.9802 and 4.7667 when 335 variables were selected by the RT method. Moreover, the CARS method selected 63 variables, and the three parameters change to 138.6546, 0.9742 and 4.1897, respectively. For sesamolin, improvements are also obtained. In a detailed comparison, the RT method is more efficient for sesamin and the CARS method is more efficient for sesamolin, with better prediction results.
Table 3
3.4 Validation studies of optimized models
The samples in the prediction set were used to evaluate the optimized PLS models by calculating the concentrations of the two lignans and comparing them with the results obtained by the HPLC method. The RMSEPs, Rs (prediction set) and RPDs of the optimized models are listed in Tables 2 and 3. For comparison, the results of the PLS model constructed from the raw spectra are also shown in Table 2. For both sesamin and sesamolin, the improvements of the PLS models in predictability and accuracy can be observed clearly by comparing the parameters in the two tables. Taking sesamin as an example, the RMSEP, R (prediction set) and RPD of the PLS model constructed from the raw spectra are 348.4468, 0.8754 and 2.0746. When the CWT method was used to preprocess the spectra, the three parameters change to 224.7808, 0.9447 and 3.0401. When 335 variables were selected by the RT method, the three parameters further change to 151.2951, 0.9754 and 4.5508, respectively. In addition, for most PLS models, the RMSECV and RMSEP, the R values of the calibration and prediction sets, and the RPD values of the calibration and prediction sets are close. The combination of CWT and the RT method gives the best optimization results for the samples in the calibration set. The PLS model optimized by this combination also has the best prediction results for the samples in the prediction set. For sesamolin, the improvements are similar, and the combination of CWT and the CARS method gives better prediction results than the other combinations.
Fig. 3(a) and (b) show scatter plots of the sesamin and sesamolin contents obtained by the HPLC method and the optimized models, respectively. In each subgraph, the red and green points represent the results for the samples in the calibration and prediction sets, respectively. Two fitting lines (red and green) and a diagonal line (dark) are also plotted. For sesamin, Fig. 3(a) shows that the three lines almost overlap, and most red and green points are reasonably distributed along the three lines. A few samples lie a little far from the three lines. For sesamolin, similar results were obtained, and the red and green lines are closer to the dark line than the two lines in Fig. 3(a).
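The size of the improvement for sesamin can be made explicit with a short worked calculation. The three RMSEP values are those quoted above for the raw spectra, CWT, and CWT combined with RT; the helper function and variable names are hypothetical. The reductions relative to the raw-spectra model come out at about 35.5% and 56.6%.

```python
def percent_reduction(before, after):
    """Relative decrease of an error measure, in percent."""
    return 100.0 * (before - after) / before


# RMSEP of the sesamin models on the prediction set (values quoted in the text).
rmsep_raw = 348.4468
rmsep_cwt = 224.7808
rmsep_cwt_rt = 151.2951

reduction_cwt = percent_reduction(rmsep_raw, rmsep_cwt)
reduction_cwt_rt = percent_reduction(rmsep_raw, rmsep_cwt_rt)
print(f"CWT: {reduction_cwt:.1f}%  CWT+RT: {reduction_cwt_rt:.1f}%")
```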
Figure 3
The prediction results for the samples in the prediction set indicate that the optimized models can correctly calculate the lignans in samples harvested in the same year and from the same areas. To further evaluate the optimized models, they were used to predict the samples harvested in different years (datasets 2 and 3) or from different areas (dataset 4). Table 4 lists the prediction results for the three datasets. The prediction results obtained by the other combinations are also listed for comparison. For the samples harvested in 2017 (dataset 2), when the combination of CWT and the RT method was used to predict sesamin, the RMSEP, R and RPD were 170.0418, 0.9660 and 3.5984, respectively. For the samples in dataset 3, the three parameters change to 186.2456, 0.9565 and 3.3677. Compared with the prediction results in Table 3, the optimized models are less effective, with larger RMSEP and smaller R and RPD. As 2017 is closer to 2018 than 2016, the results for dataset 2 were slightly better. For sesamolin, the prediction results were similar. Moreover, when the optimized models (CWT+CARS) were used to predict the lignans of samples from other areas, the results were worse. For sesamin, the RMSEP, R and RPD were 191.6077, 0.9472 and 2.9535, and for sesamolin, the three parameters were 67.8281, 0.8737 and 2.0315, respectively.
Table 4
Fig. 4(a) and (b) show scatter plots of the sesamin and sesamolin contents obtained by the HPLC method and the optimized models, respectively. Three fitting lines (olive, orange and navy-blue) and a diagonal line (dark) are also plotted. It can be seen in each subgraph that most olive, orange and navy-blue points are reasonably distributed along the three lines. However, compared with the points in Fig. 3(a) and (b), some points lie a little far from the lines, which indicates larger prediction errors of the optimized models. Moreover, the three lines in each subgraph lie a little far from the dark line. Thus, systematic error may exist when the optimized models are used to predict the samples in datasets 2, 3 and 4. Table 4 and Fig. 4 indicate that the optimized models became less effective when they were used to predict samples harvested in a different year or from other areas. As most correlation coefficients (R) were larger than 0.90, the prediction results were still acceptable.
Figure 4
4. Conclusion
The possibility of accurate, rapid and non-destructive determination of sesamin and sesamolin in Chinese sesame seeds by near-infrared spectroscopy coupled with chemometric methods was investigated in this study. The partial least squares method was used to construct calibration models. Spectral preprocessing methods, including 1st derivative, CWT, MSC and SNV, and variable selection techniques, including the MC-UVE, RT and CARS methods, were adopted to optimize the PLS models. The CWT method gave the best preprocessing results, and among the variable selection techniques, the RT and CARS methods were more efficient for sesamin and sesamolin, respectively, with better prediction results. The optimized PLS models were used to calculate the concentrations of sesamin and sesamolin in the prediction samples, and the calculated results are close to the concentrations obtained by HPLC analysis. When the optimized PLS models were used to predict samples harvested in different years or from other areas, acceptable results were also obtained. Thus, the proposed method provides an alternative choice for fast and non-destructive analysis of lignans in sesame.
5. Acknowledgements
This study was supported by the National Key Research and Development Program of China (2017YFC1600500), the Open Foundation of Key Laboratory of Deep Processing of Major Grain and Oil (Wuhan Polytechnic University), Ministry of Education (2018JYBQGDKFB01) and the National Natural Science Foundation of China (No. 31701596).
References
[1] K.R. Anilakumar, A. Pal, F. Khanum, A.S. Bawa, Agric. Conspec. Sci. 75 (2010) 159–168.
[2] P. Budowski, K.S. Markley, Chem. Rev. 48 (1951) 125–151.
[3] P. Budowski, J. Am. Oil Chem. Soc. 41 (1964) 280–285.
[4] K.P. Suja, A. Jayalekshmy, C. Arumughan, J. Agr. Food Chem. 52 (2004) 912–915.
[5] T. Yokota, Y. Matsuzaki, M. Koyama, T. Hitomi, M. Kawanaka, M. Enoki-Konishi, Y. Okuyama, J. Yakayasu, H. Nishino, A. Nishikawa, T. Osawa, T. Sakai, Cancer Sci. 98 (2007) 1447–1453.
[6] F.C. Cheng, T.R. Jinn, R.C. Hou, J.T.C. Tzen, Int. J. Biomed. Sci. 2 (2006) 284–288.
[7] L. Ashakumary, I. Rouyer, Y. Takahashi, T. Ide, N. Fukuda, T. Aoyama, T. Hashimoto, M. Mizugaki, M. Sugano, Metabolism 48 (1999) 1303–1313.
[8] Ghafoorunissa, S. Hemalatha, M.V.V. Rao, Mol. Cell. Biochem. 262 (2004) 195–202.
[9] N.P. Visavadiya, A.V.R.L. Narasimhacharya, Food Chem. Toxicol. 46 (2008) 1889–1895.
[10] C.C. Lee, P.R. Chen, S. Lin, S.C. Tsai, B.W. Wang, W.W. Chen, C. Tsai, K.G. Shyu, J. Hypertens. 22 (2004) 2329–2338.
[11] D. Nakano, D. Kurumazuka, Y. Nagai, A. Nishiyama, Y. Kiso, Y. Matsumura, Clin. Exp. Pharmacol. P. 35 (2008) 324–326.
[12] D. Bedigian, D.S. Seigler, J.R. Harlan, Biochem. Syst. Ecol. 13 (1985) 133–139.
[13] T. Fukuda, M. Nagata, T. Osawa, M. Namiki, J. Am. Oil Chem. Soc. 63 (1986) 1027–1031.
[14] A. Kamal-Eldin, G. Yousif, L.A. Appelqvist, J. Am. Oil Chem. Soc. 68 (1991) 844–847.
[15] S. Hammann, M. Englert, M. Muller, W. Vetter, Anal. Bioanal. Chem. 407 (2015) 9019–9028.
[16] P. Gornas, A. Siger, I. Pugajeva, D. Segliņa, Food Addit. Contam. 31 (2014) 567–573.
[17] R. Wu, F. Ma, L. Zhang, P. Li, G. Li, Q. Zhang, W. Zhang, X. Wang, Food Chem. 204 (2016) 334–342.
[18] M. Takahashi, Y. Nishizaki, N. Sugimoto, H. Takeuchi, K. Nakagawa, H. Akiyama, K. Sato, K. Inoue, J. Sep. Sci. 39 (2016) 3898–3905.
[19] A. Kamal-Eldin, L.A. Appelqvist, J. Am. Oil Chem. Soc. 71 (1994) 149–156.
[20] S. Hemalatha, Ghafoorunissa, J. Am. Oil Chem. Soc. 81 (2004) 467–470.
[21] N. Rangkadilok, N. Pholphana, C. Mahidol, W. Wongyai, K. Saengsooksree, S. Nookabkaew, J. Satayavivad, Food Chem. 122 (2010) 724–730.
[22] Y. Roggo, P. Chalus, L. Maurer, C. Lema-Martinez, A. Edmond, N. Jent, J. Pharmaceut. Biomed. 44 (2007) 683–700.
[23] H. Huang, H. Yu, H. Xu, Y. Ying, J. Food Eng. 87 (2008) 303–313.
[24] N. Prieto, R. Roehe, P. Lavin, G. Batten, S. Andres, Meat Sci. 83 (2009) 175–186.
[25] N. Prieto, O. Pawluczyk, M.E.R. Dugan, J.L. Aalhus, Appl. Spectrosc. 71 (2017) 1403–1426.
[26] R. Moscetti, R.P. Haff, D. Monarca, M. Cecchini, R. Massantini, Postharvest Biol. Tec. 120 (2016) 204–212.
[27] Y. Guo, Y.N. Ni, S. Kokot, Spectrochim. Acta A 153 (2016) 79–86.
[28] R. Beghi, V. Giovenzana, A. Tugnolo, R. Guidetti, J. Sci. Food Agr. 98 (2018) 2729–2734.
[29] X. Huang, H. Xu, L. Wu, H. Dai, L. Yao, F. Han, Anal. Methods 8 (2016) 2929–2935.
[30] J.H. Cheng, D.W. Sun, LWT-Food Sci. Technol. 62 (2015) 1060–1068.
[31] X.G. Shao, A.K.M. Leung, F.T. Chau, Acc. Chem. Res. 36 (2003) 276–283.
[32] Z.P. Chen, J. Morris, E. Martin, Anal. Chem. 78 (2006) 7674–7681.
[33] A. Rinnan, F. van den Berg, S.B. Engelsen, TrAC Trends Anal. Chem. 28 (2009) 1201–1222.
[34] Z.C. Liu, W.S. Cai, X.G. Shao, Sci. China Chem. 51 (2008) 751–759.
[35] X.H. Bian, W.S. Cai, X.G. Shao, D. Chen, E.R. Grant, Analyst 135 (2010) 2841–2847.
[36] Y.H. Yun, W.T. Wang, M.L. Tan, Y.Z. Liang, H.D. Li, D.S. Cao, H.M. Lu, Q.S. Xu, Anal. Chim. Acta 807 (2014) 36–43.
[37] Q.J. Han, H.L. Wu, C.B. Cai, L. Xu, R.Q. Yu, Anal. Chim. Acta 612 (2008) 121–125.
[38] H.D. Li, Y.Z. Liang, Q.S. Xu, D.S. Cao, Anal. Chim. Acta 648 (2009) 77–84.
[39] W.S. Cai, Y.K. Li, X.G. Shao, Chemom. Intell. Lab. Syst. 90 (2008) 188–194.
[40] H. Xu, Z.C. Liu, W.S. Cai, X.G. Shao, Chemom. Intell. Lab. Syst. 97 (2009) 189–193.
[41] L. Norgaard, A. Saudland, J. Wagner, J.P. Nielsen, L. Munck, S.B. Engelsen, Appl. Spectrosc. 54 (2000) 413–419.
[42] J.H. Jiang, R.J. Berry, H.W. Siesler, Y. Ozaki, Anal. Chem. 74 (2002) 3555–3565.
[43] H.D. Li, Y.Z. Liang, D.S. Cao, Q.S. Xu, TrAC Trends Anal. Chem. 38 (2012) 154–162.
[44] M. Jing, W.S. Cai, X.G. Shao, Chemom. Intell. Lab. Syst. 100 (2010) 22–27.
[45] X.G. Shao, X.H. Bian, W.S. Cai, Anal. Chim. Acta 666 (2010) 32–37.
[46] K.S. Kim, S.H. Park, M.G. Choung, J. Agric. Food Chem. 54 (2006) 4544–4550.
[47] Q.S. Xu, Y.Z. Liang, Chemom. Intell. Lab. Syst. 56 (2001) 1–11.
[48] W. Kaye, Spectrochim. Acta 6 (1954) 257–287.
[49] D. Bassi, L. Menegotti, S. Oss, M. Scotini, F. Iachello, Chem. Phys. Lett. 207 (1993) 167–172.
Table 1 Detail information of samples in four datasets Dataset
1
2
3
Area middle reaches of Changjiang River Northeast China Huang-Huai Valleys middle reaches of Changjiang River Northeast China Huang-Huai Valleys middle reaches of Changjiang River Northeast China Huang-Huai Valleys
Sesamin Mean Range (mg/kg) (mg/kg)
Sesamolin Mean Range (mg/kg) (mg/kg)
Harvested Year
Number of Samples
2018
64
501.5350-2338.7308 1181.5579 442.1108
2018
63
564.2793-5098.0734 1535.9661 989.7029 323.1102-1648.8824 614.5374 281.8037
2018
66
448.9690-2188.6924 1212.3636 410.7286
2017
66
526.6134-2164.4793 1158.3958 327.0240
2017
70
2017
386.2513-840.1394
507.2173
SDa (mg/kg) 88.3933
354.2153-530.0388
530.0388 112.7894
379.8163-519.1792
519.1792
80.5247
565.9551-4832.9337 1591.7395 805.4976 370.8917-1232.9687 615.0026 211.0706 570.5511-1908.5324 1205.4512 329.4276
322.0535-739.9504
536.8959
90.3521
66
469.5978-2003.0924 1098.0744 323.7285
339.9873-734.2633
516.8574
81.5316
2016
63
608.4513-4048.5795 1461.4115 742.4133 340.2382-1323.3634 592.9069 194.8539
2016
65
675.9492-1783.8225 1204.5685 291.3762
2016
69
SD (mg/kg)
340.2496-770.0891
523.8376
88.0517
4 a
Ethiopia/Sudan Unknown
151
528.0992-3558.1668 1342.8871
565.905
320.731-1112.2972
Standard deviation
549.8443 137.7909
Table 2 Prediction results of PLS models optimized by different spectral preprocessing methods.

                      Calibration set                 Prediction set
Lignan      Method    RMSECV     R       RPD          RMSEP      R       RPD
Sesamin     Raw       310.5864   0.8750  2.2891       348.4468   0.8754  2.0746
            1st       216.8498   0.9592  3.4213       233.1341   0.9419  2.9714
            CWT       211.9511   0.9625  3.5485       224.7808   0.9447  3.0401
            MSC       307.9477   0.8876  2.2795       301.5235   0.8818  2.3904
            SNV       295.8644   0.9108  2.1779       372.0474   0.9018  2.1439
Sesamolin   Raw        98.8007   0.8383  1.4798       102.0307   0.8132  1.7231
            1st        56.4289   0.9274  2.7282        70.2065   0.8866  2.5041
            CWT        53.0297   0.9313  2.8795        68.3724   0.8912  2.5713
            MSC        96.8797   0.8448  1.9044        99.8272   0.8500  1.5656
            SNV        92.3154   0.8680  2.2597        94.0381   0.8529  1.9197
Table 3 Prediction results of PLS models optimized by CWT method coupling with three variable selection methods

                           Number of    Calibration set                 Prediction set
Lignan      Methods        Variables    RMSECV     R       RPD          RMSEP      R       RPD
Sesamin     CWT+MC-UVE     339          141.5915   0.9728  4.0697       179.0655   0.9571  3.7035
            CWT+RT         335          133.9682   0.9802  4.7667       151.2951   0.9754  4.5508
            CWT+CARS        63          138.6546   0.9742  4.1897       167.3339   0.9605  3.8330
Sesamolin   CWT+MC-UVE     205           50.0668   0.945   3.3725        59.8384   0.9362  2.8505
            CWT+RT         266           46.4240   0.9581  3.7156        53.7935   0.9486  3.3675
            CWT+CARS       115           39.6023   0.9742  3.9392        39.7720   0.9636  3.7043
Table 4 Prediction results of dataset 2, 3 and 4

Dataset     Lignan      Method         RMSEP      R       RPD
Dataset 2   Sesamin     CWT+MC-UVE     185.8253   0.9559  3.3941
                        CWT+RT         170.0418   0.9660  3.5984
                        CWT+CARS       182.0153   0.9637  3.4597
            Sesamolin   CWT+MC-UVE      58.2094   0.9147  2.4566
                        CWT+RT          58.1769   0.9177  2.4978
                        CWT+CARS        55.5021   0.9285  2.6558
Dataset 3   Sesamin     CWT+MC-UVE     195.9905   0.9471  3.1462
                        CWT+RT         186.2456   0.9565  3.3677
                        CWT+CARS       190.3762   0.9493  3.0816
            Sesamolin   CWT+MC-UVE      61.6743   0.9032  2.3161
                        CWT+RT          58.8528   0.9033  2.3174
                        CWT+CARS        55.3507   0.9183  2.5062
Dataset 4   Sesamin     CWT+MC-UVE     194.3238   0.9470  2.9122
                        CWT+RT         191.1124   0.9481  2.9611
                        CWT+CARS       191.6077   0.9472  2.9535
            Sesamolin   CWT+MC-UVE      71.2143   0.8582  1.9349
                        CWT+RT          70.4654   0.8615  1.9554
                        CWT+CARS        67.8281   0.8737  2.0315
Figure captions:
Fig. 1 Variations of PRESS along with latent variable number
Fig. 2 Selected variables of spectra preprocessed with CWT using MC-UVE, RT and CARS methods
Fig. 3 Scatter plots of sesamin (a) and sesamolin (b) contents obtained by HPLC method and optimized models (dataset one)
Fig. 4 Scatter plots of sesamin (a) and sesamolin (b) contents obtained by HPLC method and optimized models (dataset two, three and four)
Figure 1
Figure 2
Figure 3
Figure 4
Declaration of interest statement
The authors declare that they have no conflicts of interest in this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Graphical abstract
Highlights
1. Sesamin and sesamolin in sesame from three Chinese areas were determined by NIR and PLS.
2. Spectral pretreatment and variable selection methods were adopted to optimize the model.
3. Prediction results of sesamin and sesamolin by the optimized model and the HPLC method were close.