Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method

Journal Pre-proof Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method

Zhenzhen Xia, Tian Yi, Yan Liu

PII: S1386-1425(19)31167-9

DOI: https://doi.org/10.1016/j.saa.2019.117777

Reference: SAA 117777

To appear in: Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy

Received date: 14 August 2019

Revised date: 6 November 2019

Accepted date: 6 November 2019

Please cite this article as: Z. Xia, T. Yi and Y. Liu, Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy (2019), https://doi.org/10.1016/j.saa.2019.117777

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier.


Zhenzhen Xia1, Tian Yi1, Yan Liu2,3,4

1 Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science, Wuhan, 430064, China

2 College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, China

3 Key Laboratory for Deep Processing of Major Grain and Oil (Wuhan Polytechnic University), Ministry of Education, College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, P.R. China

4 Hubei Key Laboratory for Processing and Transformation of Agricultural Products (Wuhan Polytechnic University), College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, P.R. China

Corresponding address: College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan, 430023, P. R. China

Tel: +86-27-83924790

Fax: +86-27-83924790

E-mail: [email protected]

Abstract

Sesame is one of the most important crops in Africa and East Asia. The sesamin and sesamolin in sesame have shown various pharmacological, biological and physiological activities. In this study, a rapid and nondestructive method for the determination of sesamin and sesamolin in Chinese sesame by near-infrared spectroscopy coupled with chemometric methods is proposed. The near-infrared spectra of sesame samples from three different Chinese areas were collected, and partial least squares (PLS) was used to construct the quantitative models. Spectral preprocessing and variable selection methods were adopted to improve the predictability and stability of the models. Reasonable quantitative results were obtained when the samples used for model construction and prediction were harvested in the same year. For sesamin and sesamolin, the correlation coefficients (R) were 0.9754 and 0.9636, and the root mean square errors of prediction (RMSEP) were 151.2951 and 39.7720, respectively. The optimized models were less effective when used to predict samples harvested in other years or in other countries. However, acceptable results could still be obtained.

Key words: near-infrared spectroscopy; sesame; sesamin; sesamolin; partial least squares.

1. Introduction

Sesame (Sesamum indicum L.) has been widely consumed as a traditional food and oil crop in Africa and East Asia for thousands of years [1]. Sesame contains various active constituents, among which sesamin and sesamolin are the two typical and major lignans and play significant roles in seed physiology. First discovered and isolated in the 1950s, the two components have shown strong bioactivity as well as pharmacological and physiological activities in various studies [2, 3]. For instance, Suja et al. reported the free radical scavenging behavior of sesamin and sesamolin in the DPPH system [4]. Yokota et al. discovered that lignans in sesame could down-regulate cyclin D1 protein expression in human tumor cells [5]. The neuroprotective effects of sesamin and sesamolin on gerbil brain in cerebral ischemia were reported by Cheng et al. [6]. Other activities, such as increasing hepatic fatty acid oxidation enzymes [7], enhancing the antioxidant activity of vitamin E in lipid peroxidation systems [8], lowering cholesterol levels [9] and antihypertensive effects [10, 11], were also reported.

Because of their potential for health promotion, the two lignans have been used as active ingredients in antioxidants, antiseptics, bactericides, viricides, disinfectants, moth repellants and anti-tubercular agents [12]. Therefore, it is important to determine sesamin and sesamolin in sesame seeds. Chromatographic methods are the most widely used for determining sesamin and sesamolin and have proved accurate and stable in practical applications. Various chromatographic methods, including thin-layer chromatography (TLC) [13, 14], gas chromatography (GC) [15], normal-phase liquid chromatography (NPLC) [16], reversed-phase liquid chromatography (RPLC) [17, 18] and high-performance liquid chromatography (HPLC) [19-21], have been adopted to analyze the two lignans in sesame all over the world. Chromatographic methods, however, suffer from sophisticated pretreatment, time-consuming procedures and environmentally hazardous reagents. These shortcomings restrict their large-scale application in rapid determination. Thus, to satisfy the need for rapid determination, a simple, time-saving and non-destructive analytical method for sesamin and sesamolin is needed.

As a non-destructive, rapid, eco-friendly and low-cost analytical technique, near-infrared spectroscopy has been applied on a large scale in various fields [22-24]. In food analysis, a large number of applications, including meat [25], vegetables [26], fruit [27, 28] and fish [29, 30], have been reported. Generally, near-infrared absorption bands are weak and broad. Therefore, a reliable calibration model has to be used to extract quantitative or qualitative information from the spectrum. Support vector regression (SVR), principal component regression (PCR), partial least squares (PLS) and artificial neural networks (ANN) are the most commonly used calibration methods. In some situations, multiplicative scattering, background shift or spectral redundancy may reduce the accuracy and stability of the model. To optimize the calibration models, several chemometric methods, including spectral pretreatment [31-33], outlier detection [34, 35], variable selection [36-42] and ensemble modeling [43-45], can be used.

The possibility of determining sesamin and sesamolin has been investigated before [46]. Improvements, however, are still needed. Thus, the possibility of accurate, rapid and non-destructive determination of sesamin and sesamolin in Chinese sesame using near-infrared spectroscopy coupled with chemometric methods was investigated in this study. Partial least squares (PLS) was used to construct the near-infrared quantitative analysis models. To optimize the PLS models, the 1st derivative, continuous wavelet transform (CWT), multiplicative scatter correction (MSC) and standard normal variate (SNV) were used to preprocess the spectra. Moreover, three wavelength selection methods, Monte Carlo uninformative variable elimination (MC-UVE) [39], the randomization test (RT) [40] and competitive adaptive reweighted sampling (CARS) [38], were adopted to select the useful variables and further improve the predictability and stability of the models. To evaluate the proposed method, the concentrations of sesamin and sesamolin obtained by the method and by HPLC analysis were compared.

2. Experimental and calculations

2.1 Reagents and samples

Sesamin (purity > 98%) and sesamolin (purity > 95%) were purchased from Aladdin (China). Methyl alcohol was obtained from Tianjin Chemical Reagent Corporation (China). A total of 743 sesame seed samples were collected. These samples belong to four different datasets. The first three datasets contain samples harvested in 2018, 2017 and 2016, respectively. All the samples in these three datasets come from three Chinese areas (the middle reaches of the Changjiang River, Northeast China and the Huang-Huai Valleys). The middle reaches of the Changjiang River include Hubei, Hunan, Jiangxi and the southern part of Anhui province. Northeast China includes Heilongjiang, Jilin and Liaoning provinces, and the Huang-Huai Valleys include Henan, the southern part of Shandong and the northern part of Anhui province. Additionally, dataset four contains 151 samples collected from East Africa (Ethiopia and Sudan). Table 1 shows the detailed information of the samples in each dataset.

Table 1

2.2 HPLC analysis

The reference values of sesamin and sesamolin in the sesame samples were determined by HPLC. The extraction procedure of Ref. [20] was adopted to extract the two lignans from the samples. The extracts were then analyzed by HPLC according to Ref. [21]. The analysis results are also listed in Table 1.

2.3 Spectral measurement

A NIR spectrometer in diffuse reflection mode (Antaris Ⅱ, Thermo Fisher, USA) was used to collect the spectra. Each spectrum ranges from 3999.6 cm-1 to 9999.1 cm-1 and consists of 3112 data points with a digitization interval of ca. 1.93 cm-1. The scan number was 64 and the measurements were performed at room temperature.

2.4 Calculations

The 193 samples in dataset 1 were randomly divided into a calibration set and a prediction set. The calibration set contains 150 samples, and the spectra of these samples were processed by four spectral preprocessing methods (1st derivative, CWT, MSC and SNV). The raw and preprocessed spectra were then used to construct separate PLS models. The model with the best cross-validation results was further optimized by the MC-UVE, RT and CARS methods. The prediction set, which contains 43 samples, was used to evaluate the optimized models. The concentrations of sesamin and sesamolin in the prediction-set samples were calculated by the optimized models and then compared with the results obtained by the HPLC method. Moreover, to test the models on sesame seeds harvested in different years or from different countries, the samples in datasets 2, 3 and 4 were also used to evaluate the optimized models. Four parameters, the correlation coefficient (R), the root mean squared error of cross validation (RMSECV), the root mean squared error of prediction (RMSEP) and the ratio of prediction to deviation (RPD), were adopted as criteria for evaluating the models.
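The evaluation parameters named above can be illustrated with a short sketch. The text does not give their formulas, so the definitions below are assumptions based on common chemometric usage: the root mean squared error is computed over the reference (HPLC) and predicted values, R is the Pearson correlation between them, and RPD is taken as the standard deviation of the reference values divided by the root mean squared error. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def rmse(y_ref, y_pred):
    """Root mean squared error between reference and predicted values.
    Gives RMSECV for cross-validated predictions and RMSEP for a prediction set."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_ref - y_pred) ** 2)))

def correlation(y_ref, y_pred):
    """Pearson correlation coefficient R between reference and predicted values."""
    return float(np.corrcoef(y_ref, y_pred)[0, 1])

def rpd(y_ref, y_pred):
    """Assumed definition: sample SD of the reference values divided by the RMSE."""
    return float(np.std(np.asarray(y_ref, dtype=float), ddof=1) / rmse(y_ref, y_pred))

# Toy values (not taken from the study) to show the calls.
y_hplc = [1.0, 2.0, 3.0, 4.0]
y_nir = [1.1, 1.9, 3.2, 3.8]
demo_rmse = rmse(y_hplc, y_nir)
demo_r = correlation(y_hplc, y_nir)
demo_rpd = rpd(y_hplc, y_nir)
```

Under this assumed definition, a larger RPD means that the prediction error is small relative to the spread of the reference values, which is how the parameter is used when models are compared in the tables.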

3. Results and discussion

3.1 Determination of the number of latent variables

The raw spectra and the spectra preprocessed by the 1st derivative, CWT, MSC and SNV were used to construct the PLS models of sesamin and sesamolin, respectively. The number of latent variables is the key parameter in constructing a PLS model. In this study, Monte Carlo cross validation (MCCV) [47] was adopted to determine the number of latent variables for the different PLS models. This method determines the number of latent variables by calculating the predicted residual sum of squares (PRESS) of a PLS model. When an appropriate number of latent variables is chosen, the PRESS should be small enough. For the PLS models of sesamin and sesamolin, the variations of PRESS with increasing number of latent variables are plotted in Fig. 1(a) and (b), respectively.

For the models of sesamin, it can be seen from Fig. 1(a) that the trends of the PRESS values are similar. The values are large when the number of latent variables is small and then decrease as the number increases. Finally, the values fluctuate within a narrow range, and the ranges differ slightly between models. For instance, the PRESS of the model constructed from the raw spectra (blue line) decreases from 4.6×10^7 to 1.6×10^7 when the number of latent variables increases from one to seven. The value then fluctuates slightly around 1.4×10^7 when the number of latent variables is larger than 10. Therefore, the numbers of latent variables for the PLS models of the raw spectra and the spectra preprocessed with the 1st derivative, CWT, MSC and SNV were set to 12, 12, 13, 14 and 14, respectively. For the models of sesamolin, similar behavior was observed in Fig. 1(b). Thus, the numbers of latent variables for the five PLS models are 14, 15, 15, 16 and 16.
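The PRESS-based choice of the number of latent variables can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the PLS1 routine is a plain NIPALS version written for this example, and the number of Monte Carlo splits, the fraction of samples left out and the synthetic data are all assumptions, since the text does not report them.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model by NIPALS; return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))
    P = np.zeros((n_vars, n_components))
    q = np.zeros(n_components)
    n_used = 0
    for a in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:            # nothing left in y to explain
            break
        w = w / norm
        t = Xk @ w                  # scores of this latent variable
        tt = t @ t
        p = Xk.T @ t / tt           # X loadings
        q[a] = (yk @ t) / tt        # y loading
        Xk = Xk - np.outer(t, p)    # deflate X
        yk = yk - q[a] * t          # deflate y
        W[:, a], P[:, a] = w, p
        n_used = a + 1
    if n_used == 0:
        return np.zeros(n_vars), x_mean, y_mean
    W, P, q = W[:, :n_used], P[:, :n_used], q[:n_used]
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def mccv_press(X, y, max_lv, n_splits=100, valid_fraction=0.3, seed=0):
    """Accumulate PRESS for 1..max_lv latent variables over random splits."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_valid = max(1, int(round(valid_fraction * n)))
    press = np.zeros(max_lv)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        valid, train = perm[:n_valid], perm[n_valid:]
        for k in range(1, max_lv + 1):
            coef, x_mean, y_mean = pls1_fit(X[train], y[train], k)
            y_hat = (X[valid] - x_mean) @ coef + y_mean
            press[k - 1] += np.sum((y[valid] - y_hat) ** 2)
    return press

# Synthetic data with three underlying factors (not the sesame spectra).
rng = np.random.default_rng(1)
scores = rng.normal(size=(80, 3))
loadings = rng.normal(size=(3, 40)) * np.array([[10.0], [1.0], [0.1]])
X_demo = scores @ loadings + 0.01 * rng.normal(size=(80, 40))
y_demo = scores @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=80)
press = mccv_press(X_demo, y_demo, max_lv=6, n_splits=50)
best_lv = int(np.argmin(press)) + 1
```

In this synthetic case the PRESS falls steeply over the first three latent variables and then levels off, which mirrors the qualitative behavior described for Fig. 1. Picking the minimum is only one possible rule; the text instead selects a value within the range where PRESS fluctuates narrowly.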

Figure 1

3.2 Effect of spectral pretreatment

Previous studies have shown the effects of spectral pretreatment on PLS models [22-24, 31-33]. Therefore, the raw spectra were preprocessed by the 1st derivative, CWT, MSC and SNV, and then used to construct the PLS models. Cross validation was used to evaluate the models, and the results are shown in Table 2. For comparison, the results of the model constructed from the raw spectra were also calculated. Clearly, the models constructed from the preprocessed spectra show higher predictability and accuracy. For sesamin, the RMSECV, R (calibration set) and RPD of the model constructed from the raw spectra were 310.5864, 0.8750 and 2.2991, respectively. For the models constructed from the spectra preprocessed with the 1st derivative and CWT, the RMSECV decreases to 216.8498 and 211.9511, while the R (calibration set) increases to 0.9592 and 0.9625 and the RPD increases to 3.4213 and 3.5485, respectively. MSC and SNV also improve the predictability and accuracy of the PLS models, with smaller RMSECVs and higher Rs (calibration set) and RPDs, but the improvements are smaller than those of the 1st derivative and CWT. In a detailed comparison, CWT is slightly better than the 1st derivative. For sesamolin, similar improvements from spectral preprocessing can also be observed, and CWT gives the best results. Thus, for both sesamin and sesamolin, CWT was used to preprocess the spectra of the sesame seeds.
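Three of the four pretreatments compared here can be sketched in a few lines. This is an illustrative sketch under stated assumptions: SNV and MSC follow their usual textbook definitions, the 1st derivative is approximated by a simple finite difference because the text does not say how it was computed, and CWT is left out because neither the wavelet nor the scale is reported. The wavenumber axis uses the range and point count given in the spectral measurement section; the spectra themselves are synthetic.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # fit s = slope * ref + intercept
        corrected[i] = (s - intercept) / slope
    return corrected

def first_derivative(spectra, wavenumbers):
    """Finite-difference first derivative along the wavenumber axis (an assumption)."""
    return np.gradient(np.asarray(spectra, dtype=float), wavenumbers, axis=1)

# Synthetic spectra: one band shape with different gains and offsets per sample.
wavenumbers = np.linspace(3999.6, 9999.1, 3112)
base = 1.0 + 0.5 * np.sin(wavenumbers / 500.0)
gains = np.array([0.8, 1.0, 1.3])
offsets = np.array([0.05, 0.0, -0.02])
spectra = gains[:, None] * base[None, :] + offsets[:, None]

spectra_snv = snv(spectra)
spectra_msc = msc(spectra)
spectra_d1 = first_derivative(spectra, wavenumbers)
```

Because the synthetic spectra differ only by a gain and an offset, both SNV and MSC map them onto a single curve, which is the scatter-removal effect these methods are meant to provide. Real spectra also carry chemical differences, so the corrected spectra would not coincide.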

Table 2

3.3 Effect of variable selection

To further optimize the PLS models, the MC-UVE, RT and CARS methods were used to select the useful variables from the spectra preprocessed by CWT. Fig. 2(a) and (b) plot the selected variables for sesamin and sesamolin, respectively. The mean spectrum preprocessed by CWT is also plotted for comparison. For sesamin, the selected variables located in 4000-5200 cm-1 are attributed to the combination of CC stretching vibrations and CH stretching vibrations [48]. The selected variables around 6000 cm-1 are assigned to the first overtone of CH stretching vibrations. The selected variables around 8000-9000 cm-1 and 10000 cm-1 generally represent the second and third overtones of CH stretching vibrations, respectively [49]. The distributions of the variables selected by the three methods are similar. In a detailed comparison, the results of the MC-UVE and RT methods are more similar to each other, and the CARS method selected fewer variables than the other two methods. For sesamolin, it can be seen from Fig. 2(b) that the distributions of the selected variables are similar to those plotted in Fig. 2(a), which may be explained by the similar structures of the two compounds. However, the selected variables in each region differ slightly owing to the small structural differences. For instance, in the wavenumber region of the combination of CC and CH stretching vibrations, all three wavelength selection methods selected variables for both sesamin and sesamolin, but the selected variables in Fig. 2(a) and (b) are different.

Figure 2

Table 3 lists the numbers of variables selected by the MC-UVE, RT and CARS methods. The RMSECV, R (calibration set) and RPD of the PLS models optimized with CWT coupled with the three variable selection methods are also shown in the table. The effect of the variable selection methods can be clearly seen by comparing the three parameters in Tables 2 and 3. Taking sesamin as an example, the RMSECV, R (calibration set) and RPD of the PLS model constructed from the spectra preprocessed with CWT in Table 2 are 211.9511, 0.9625 and 3.5485, respectively. When 339 variables were selected by the MC-UVE method, the three parameters changed to 141.5915, 0.9728 and 4.0697. Similarly, the three parameters changed to 133.9682, 0.9802 and 4.7667 when 335 variables were selected by the RT method. Moreover, the CARS method selected 63 variables, and the three parameters changed to 138.6546, 0.9742 and 4.1897, respectively. For sesamolin, improvements were also obtained. In a detailed comparison, the RT method is more effective for sesamin and the CARS method is more effective for sesamolin, giving better prediction results.
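The idea behind one of the three selection methods, MC-UVE, can be sketched as follows. This is a simplified illustration and not the procedure used to produce Table 3: PLS models are fitted on random subsets of the calibration samples, each variable is scored by the mean of its regression coefficient divided by the standard deviation across runs, and the variables with the largest absolute scores are kept. The number of runs, the subset fraction, the fixed number of retained variables and the synthetic data are assumptions; the PLS1 routine is a plain NIPALS version written for this example.

```python
import numpy as np

def pls1_coef(X, y, n_components):
    """Regression coefficients of a PLS1 model (NIPALS) on mean-centered data."""
    Xk, yk = X - X.mean(axis=0), y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:            # nothing left in y to explain
            break
        w = w / norm
        t = Xk @ w
        tt = t @ t
        p = Xk.T @ t / tt
        qa = (yk @ t) / tt
        Xk = Xk - np.outer(t, p)    # deflate X
        yk = yk - qa * t            # deflate y
        W.append(w)
        P.append(p)
        q.append(qa)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def mc_uve_stability(X, y, n_components, n_runs=200, sample_fraction=0.8, seed=0):
    """Stability of each variable: mean / SD of its coefficient over random subsets."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_sub = int(round(sample_fraction * n))
    coefs = np.empty((n_runs, X.shape[1]))
    for r in range(n_runs):
        idx = rng.choice(n, size=n_sub, replace=False)
        coefs[r] = pls1_coef(X[idx], y[idx], n_components)
    return coefs.mean(axis=0) / coefs.std(axis=0, ddof=1)

# Synthetic data: only the first three of 30 variables carry information about y.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(300, 30))
y_demo = X_demo[:, :3] @ np.ones(3) + 0.1 * rng.normal(size=300)
stability = mc_uve_stability(X_demo, y_demo, n_components=2, n_runs=100)
selected = np.sort(np.argsort(-np.abs(stability))[:3])
```

In practice the number of retained variables would be tuned, for example by the cross-validation error of the PLS model rebuilt on the retained variables, rather than fixed in advance as in this sketch.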

Table 3

3.4 Validation studies of optimized models

The samples in the prediction set were used to evaluate the optimized PLS models by calculating the concentrations of the two lignans and comparing them with the results obtained by the HPLC method. The RMSEPs, Rs (prediction set) and RPDs of the optimized models are listed in Tables 2 and 3. For comparison, the results of the PLS model constructed from the raw spectra are also shown in Table 2. For both sesamin and sesamolin, the improvements of the PLS models in predictability and accuracy can be observed clearly by comparing the parameters in the two tables. Taking sesamin as an example, the RMSEP, R (prediction set) and RPD of the PLS model constructed from the raw spectra are 348.4468, 0.8754 and 2.0746. When the CWT method was used to preprocess the spectra, the three parameters changed to 224.7808, 0.9447 and 3.0401. When 335 variables were selected by the RT method, the three parameters further changed to 151.2951, 0.9754 and 4.5508, respectively. In addition, for most PLS models, the RMSECVs and RMSEPs, the Rs of the calibration and prediction sets, and the RPDs of the calibration and prediction sets are close. The combination of CWT and the RT method gives the best optimization results for the samples in the calibration set. The PLS model optimized by this combination also has the best prediction results for the samples in the prediction set. For sesamolin, the improvements are similar, and the combination of CWT and the CARS method gives better prediction results than the other combinations.

Fig. 3(a) and (b) show scatter plots of the sesamin and sesamolin contents obtained by the two optimized models, respectively. In each subgraph, the red and green points represent the calculated results for the samples in the calibration and prediction sets. Two fitting lines (red and green) and a diagonal line (dark) are also plotted. For sesamin, Fig. 3(a) shows that the three lines almost overlap, and most red and green points are reasonably distributed along them. A few samples lie a little far from the three lines. For sesamolin, similar results were obtained, and the red and green lines are closer to the dark line than those in Fig. 3(a).

Figure 3

The prediction results for the samples in the prediction set indicate that the optimized models can correctly calculate the lignans in samples harvested in the same year and from the same areas. To further evaluate the optimized models, they were used to predict the samples harvested in different years (datasets 2 and 3) or from different areas (dataset 4). Table 4 lists the prediction results for the three datasets. The prediction results obtained by the other combinations are also listed for comparison. For the samples harvested in 2017 (dataset 2), when the combination of CWT and the RT method was used to predict sesamin, the RMSEP, R and RPD were 170.0418, 0.9660 and 3.5984, respectively. For the samples in dataset 3, the three parameters changed to 186.2456, 0.9565 and 3.3677. Compared with the prediction results in Table 3, the optimized models are less effective, with larger RMSEP and smaller R and RPD. As 2017 is closer to 2018 than 2016, the results for dataset 2 were slightly better. For sesamolin, the prediction results were similar. Moreover, when the optimized models (CWT+CARS) were used to predict the lignans of samples from other areas, the results were worse still. For sesamin, the RMSEP, R and RPD were 191.6077, 0.9472 and 2.9535, and for sesamolin, the three parameters were 67.8281, 0.8737 and 2.0315, respectively.

Table 4

Fig. 4(a) and (b) show scatter plots of the sesamin and sesamolin contents obtained by the two optimized models, respectively. Three fitting lines (olive, orange and navy-blue) and a diagonal line (dark) are also plotted. In each subgraph, most olive, orange and navy-blue points are reasonably distributed along the three lines. Compared with the points in Fig. 3(a) and (b), however, some points lie a little far from the lines, which indicates larger prediction errors of the optimized models. Moreover, the three lines in each subgraph deviate slightly from the dark line. Thus, systematic error may exist when the optimized models are used to predict the samples in datasets 2, 3 and 4. Table 4 and Fig. 4 indicate that the optimized models became less effective when they were used to predict samples harvested in different years or from other areas. As most correlation coefficients (R) were larger than 0.90, the prediction results were still acceptable.

Figure 4

4. Conclusion

The possibility of accurate, rapid and non-destructive determination of sesamin and sesamolin in Chinese sesame seeds by near-infrared spectroscopy coupled with chemometric methods was investigated in this study. The partial least squares method was used to construct the calibration models. Spectral preprocessing methods, including the 1st derivative, CWT, MSC and SNV, and variable selection techniques, including the MC-UVE, RT and CARS methods, were adopted to optimize the PLS models. The CWT method gave the best preprocessing results, and among the variable selection techniques, the RT and CARS methods were more effective for sesamin and sesamolin, respectively, giving better prediction results. The optimized PLS models were used to calculate the concentrations of sesamin and sesamolin in the prediction samples, and the calculated results are close to the concentrations obtained by HPLC analysis. When the optimized PLS models were used to predict samples harvested in different years or from other areas, acceptable results were also obtained. Thus, the proposed method provides an alternative for fast and non-destructive analysis of lignans in sesame.

5. Acknowledgements

This study was supported by the National Key Research and Development Program of China (2017YFC1600500), the Open Foundation of Key Laboratory of Deep Processing of Major Grain and Oil (Wuhan Polytechnic University), Ministry of Education (2018JYBQGDKFB01) and the National Natural Science Foundation of China (No. 31701596).

References

[1] K.R. Anilakumar, A. Pal, F. Khanum, A.S. Bawa, Agric. Conspec. Sci. 75 (2010) 159–168.
[2] P. Budowski, K.S. Markley, Chem. Rev. 48 (1951) 125–151.
[3] P. Budowski, J. Am. Oil Chem. Soc. 41 (1964) 280–285.
[4] K.P. Suja, A. Jayalekshmy, C. Arumughan, J. Agr. Food Chem. 52 (2004) 912–915.
[5] T. Yokota, Y. Matsuzaki, M. Koyama, T. Hitomi, M. Kawanaka, M. Enoki-Konishi, Y. Okuyama, J. Yakayasu, H. Nishino, A. Nishikawa, T. Osawa, T. Sakai, Cancer Sci. 98 (2007) 1447–1453.
[6] F.C. Cheng, T.R. Jinn, R.C. Hou, J.T.C. Tzen, Int. J. Biomed. Sci. 2 (2006) 284–288.
[7] L. Ashakumary, I. Rouyer, Y. Takahashi, T. Ide, N. Fukuda, T. Aoyama, T. Hashimoto, M. Mizugaki, M. Sugano, Metabolism 48 (1999) 1303–1313.
[8] Ghafoorunissa, S. Hemalatha, M.V.V. Rao, Mol. Cell. Biochem. 262 (2004) 195–202.
[9] N.P. Visavadiya, A.V.R.L. Narasimhacharya, Food Chem. Toxicol. 46 (2008) 1889–1895.
[10] C.C. Lee, P.R. Chen, S. Lin, S.C. Tsai, B.W. Wang, W.W. Chen, C. Tsai, K.G. Shyu, J. Hypertens. 22 (2004) 2329–2338.
[11] D. Nakano, D. Kurumazuka, Y. Nagai, A. Nishiyama, Y. Kiso, Y. Matsumura, Clin. Exp. Pharmacol. P. 35 (2008) 324–326.
[12] D. Bedigian, D.S. Seigler, J.R. Harlan, Biochem. Syst. Ecol. 13 (1985) 133–139.
[13] T. Fukuda, M. Nagata, T. Osawa, M. Namiki, J. Am. Oil Chem. Soc. 63 (1986) 1027–1031.
[14] A. Kamal-Eldin, G. Yousif, L.A. Appelqvist, J. Am. Oil Chem. Soc. 68 (1991) 844–847.
[15] S. Hammann, M. Englert, M. Muller, W. Vetter, Anal. Bioanal. Chem. 407 (2015) 9019–9028.
[16] P. Gornas, A. Siger, I. Pugajeva, D. Segliņa, Food Addit. Contam. 31 (2014) 567–573.
[17] R. Wu, F. Ma, L. Zhang, P. Li, G. Li, Q. Zhang, W. Zhang, X. Wang, Food Chem. 204 (2016) 334–342.
[18] M. Takahashi, Y. Nishizaki, N. Sugimoto, H. Takeuchi, K. Nakagawa, H. Akiyama, K. Sato, K. Inoue, J. Sep. Sci. 39 (2016) 3898–3905.
[19] A. Kamal-Eldin, L.A. Appelqvist, J. Am. Oil Chem. Soc. 71 (1994) 149–156.
[20] S. Hemalatha, Ghafoorunissa, J. Am. Oil Chem. Soc. 81 (2004) 467–470.
[21] N. Rangkadilok, N. Pholphana, C. Mahidol, W. Wongyai, K. Saengsooksree, S. Nookabkaew, J. Satayavivad, Food Chem. 122 (2010) 724–730.
[22] Y. Roggo, P. Chalus, L. Maurer, C. Lema-Martinez, A. Edmond, N. Jent, J. Pharmaceut. Biomed. 44 (2007) 683–700.
[23] H. Huang, H. Yu, H. Xu, Y. Ying, J. Food Eng. 87 (2008) 303–313.
[24] N. Prieto, R. Roehe, P. Lavin, G. Batten, S. Andres, Meat Sci. 83 (2009) 175–186.
[25] N. Prieto, O. Pawluczyk, M.E.R. Dugan, J.L. Aalhus, Appl. Spectrosc. 71 (2017) 1403–1426.
[26] R. Moscetti, R.P. Haff, D. Monarca, M. Cecchini, R. Massantini, Postharvest Biol. Tec. 120 (2016) 204–212.
[27] Y. Guo, Y.N. Ni, S. Kokot, Spectrochim. Acta A 153 (2016) 79–86.
[28] R. Beghi, V. Giovenzana, A. Tugnolo, R. Guidetti, J. Sci. Food Agr. 98 (2018) 2729–2734.
[29] X. Huang, H. Xu, L. Wu, H. Dai, L. Yao, F. Han, Anal. Methods 8 (2016) 2929–2935.
[30] J.H. Cheng, D.W. Sun, LWT-Food Sci. Technol. 62 (2015) 1060–1068.
[31] X.G. Shao, A.K.M. Leung, F.T. Chau, Acc. Chem. Res. 36 (2003) 276–283.
[32] Z.P. Chen, J. Morris, E. Martin, Anal. Chem. 78 (2006) 7674–7681.
[33] A. Rinnan, F. van den Berg, S.B. Engelsen, TrAC Trends Anal. Chem. 28 (2009) 1201–1222.
[34] Z.C. Liu, W.S. Cai, X.G. Shao, Sci. China Chem. 51 (2008) 751–759.
[35] X.H. Bian, W.S. Cai, X.G. Shao, D. Chen, E.R. Grant, Analyst 135 (2010) 2841–2847.
[36] Y.H. Yun, W.T. Wang, M.L. Tan, Y.Z. Liang, H.D. Li, D.S. Cao, H.M. Lu, Q.S. Xu, Anal. Chim. Acta 807 (2014) 36–43.
[37] Q.J. Han, H.L. Wu, C.B. Cai, L. Xu, R.Q. Yu, Anal. Chim. Acta 612 (2008) 121–125.
[38] H.D. Li, Y.Z. Liang, Q.S. Xu, D.S. Cao, Anal. Chim. Acta 648 (2009) 77–84.
[39] W.S. Cai, Y.K. Li, X.G. Shao, Chemom. Intell. Lab. Syst. 90 (2008) 188–194.
[40] H. Xu, Z.C. Liu, W.S. Cai, X.G. Shao, Chemom. Intell. Lab. Syst. 97 (2009) 189–193.
[41] L. Norgaard, A. Saudland, J. Wagner, J.P. Nielsen, L. Munck, S.B. Engelsen, Appl. Spectrosc. 54 (2000) 413–419.
[42] J.H. Jiang, R.J. Berry, H.W. Siesler, Y. Ozaki, Anal. Chem. 74 (2002) 3555–3565.
[43] H.D. Li, Y.Z. Liang, D.S. Cao, Q.S. Xu, TrAC Trends Anal. Chem. 38 (2012) 154–162.
[44] M. Jing, W.S. Cai, X.G. Shao, Chemom. Intell. Lab. Syst. 100 (2010) 22–27.
[45] X.G. Shao, X.H. Bian, W.S. Cai, Anal. Chim. Acta 666 (2010) 32–37.
[46] K.S. Kim, S.H. Park, M.G. Choung, J. Agric. Food Chem. 54 (2006) 4544–4550.
[47] Q.S. Xu, Y.Z. Liang, Chemom. Intell. Lab. Syst. 56 (2001) 1–11.
[48] W. Kaye, Spectrochim. Acta 6 (1954) 257–287.
[49] D. Bassi, L. Menegotti, S. Oss, M. Scotini, F. Iachello, Chem. Phys. Lett. 207 (1993) 167–172.

Table 1 Detail information of samples in four datasets

Dataset  Area                                Harvested year  Number of samples  Sesamin range (mg/kg)  Sesamin mean (mg/kg)  Sesamin SDa (mg/kg)  Sesamolin range (mg/kg)  Sesamolin mean (mg/kg)  Sesamolin SDa (mg/kg)
1        middle reaches of Changjiang River  2018            64                 501.5350-2338.7308     1181.5579             442.1108             386.2513-840.1394        507.2173                88.3933
1        Northeast China                     2018            63                 564.2793-5098.0734     1535.9661             989.7029             323.1102-1648.8824       614.5374                281.8037
1        Huang-Huai Valleys                  2018            66                 448.9690-2188.6924     1212.3636             410.7286             354.2153-530.0388        530.0388                112.7894
2        middle reaches of Changjiang River  2017            66                 526.6134-2164.4793     1158.3958             327.0240             379.8163-519.1792        519.1792                80.5247
2        Northeast China                     2017            70                 565.9551-4832.9337     1591.7395             805.4976             370.8917-1232.9687       615.0026                211.0706
2        Huang-Huai Valleys                  2017            66                 570.5511-1908.5324     1205.4512             329.4276             322.0535-739.9504        536.8959                90.3521
3        middle reaches of Changjiang River  2016            63                 469.5978-2003.0924     1098.0744             323.7285             339.9873-734.2633        516.8574                81.5316
3        Northeast China                     2016            65                 608.4513-4048.5795     1461.4115             742.4133             340.2382-1323.3634       592.9069                194.8539
3        Huang-Huai Valleys                  2016            69                 675.9492-1783.8225     1204.5685             291.3762             340.2496-770.0891        523.8376                88.0517
4        Ethiopia/Sudan                      Unknown         151                528.0992-3558.1668     1342.8871             565.905              320.731-1112.2972        549.8443                137.7909

a Standard deviation

Table 2 Prediction results of PLS models optimized by different spectral preprocessing methods.

Lignan     Method  Calibration set: RMSECV  R       RPD     Prediction set: RMSEP  R       RPD
Sesamin    Raw     310.5864                 0.8750  2.2891  348.4468               0.8754  2.0746
Sesamin    1st     216.8498                 0.9592  3.4213  233.1341               0.9419  2.9714
Sesamin    CWT     211.9511                 0.9625  3.5485  224.7808               0.9447  3.0401
Sesamin    MSC     307.9477                 0.8876  2.2795  301.5235               0.8818  2.3904
Sesamin    SNV     295.8644                 0.9108  2.1779  372.0474               0.9018  2.1439
Sesamolin  Raw     98.8007                  0.8383  1.4798  102.0307               0.8132  1.7231
Sesamolin  1st     56.4289                  0.9274  2.7282  70.2065                0.8866  2.5041
Sesamolin  CWT     53.0297                  0.9313  2.8795  68.3724                0.8912  2.5713
Sesamolin  MSC     96.8797                  0.8448  1.9044  99.8272                0.8500  1.5656
Sesamolin  SNV     92.3154                  0.8680  2.2597  94.0381                0.8529  1.9197

Table 3 Prediction results of PLS models optimized by CWT method coupled with three variable selection methods

Lignan     Method      Number of variables  Calibration set: RMSECV  R       RPD     Prediction set: RMSEP  R       RPD
Sesamin    CWT+MC-UVE  339                  141.5915                 0.9728  4.0697  179.0655               0.9571  3.7035
Sesamin    CWT+RT      335                  133.9682                 0.9802  4.7667  151.2951               0.9754  4.5508
Sesamin    CWT+CARS    63                   138.6546                 0.9742  4.1897  167.3339               0.9605  3.8330
Sesamolin  CWT+MC-UVE  205                  50.0668                  0.945   3.3725  59.8384                0.9362  2.8505
Sesamolin  CWT+RT      266                  46.4240                  0.9581  3.7156  53.7935                0.9486  3.3675
Sesamolin  CWT+CARS    115                  39.6023                  0.9742  3.9392  39.7720                0.9636  3.7043

Table 4 Prediction results of dataset 2, 3 and 4

Dataset    Lignan     Method      RMSEP     R       RPD
Dataset 2  Sesamin    CWT+MC-UVE  185.8253  0.9559  3.3941
Dataset 2  Sesamin    CWT+RT      170.0418  0.9660  3.5984
Dataset 2  Sesamin    CWT+CARS    182.0153  0.9637  3.4597
Dataset 2  Sesamolin  CWT+MC-UVE  58.2094   0.9147  2.4566
Dataset 2  Sesamolin  CWT+RT      58.1769   0.9177  2.4978
Dataset 2  Sesamolin  CWT+CARS    55.5021   0.9285  2.6558
Dataset 3  Sesamin    CWT+MC-UVE  195.9905  0.9471  3.1462
Dataset 3  Sesamin    CWT+RT      186.2456  0.9565  3.3677
Dataset 3  Sesamin    CWT+CARS    190.3762  0.9493  3.0816
Dataset 3  Sesamolin  CWT+MC-UVE  61.6743   0.9032  2.3161
Dataset 3  Sesamolin  CWT+RT      58.8528   0.9033  2.3174
Dataset 3  Sesamolin  CWT+CARS    55.3507   0.9183  2.5062
Dataset 4  Sesamin    CWT+MC-UVE  194.3238  0.9470  2.9122
Dataset 4  Sesamin    CWT+RT      191.1124  0.9481  2.9611
Dataset 4  Sesamin    CWT+CARS    191.6077  0.9472  2.9535
Dataset 4  Sesamolin  CWT+MC-UVE  71.2143   0.8582  1.9349
Dataset 4  Sesamolin  CWT+RT      70.4654   0.8615  1.9554
Dataset 4  Sesamolin  CWT+CARS    67.8281   0.8737  2.0315

Figure captions:

Fig. 1 Variations of PRESS along with latent variable number

Fig. 2 Selected variables of spectra preprocessed with CWT using MC-UVE, RT and CARS methods

Fig. 3 Scatter plots of sesamin (a) and sesamolin (b) contents obtained by HPLC method and optimized models (dataset one)

Fig. 4 Scatter plots of sesamin (a) and sesamolin (b) contents obtained by HPLC method and optimized models (dataset two, three and four)


Declaration of interest statement

The authors declare that they have no conflicts of interest in this work. We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.


Highlights

1. Sesamin and sesamolin in sesame from three Chinese areas were determined by NIR and PLS.

2. Spectral pretreatment and variable selection methods were adopted to optimize the model.

3. Prediction results of sesamin and sesamolin by the optimized model and the HPLC method were close.