Simultaneous determination of amino acid mixtures in cereal by using terahertz time domain spectroscopy and chemometrics


Chemometrics and Intelligent Laboratory Systems 164 (2017) 8–15

Contents lists available at ScienceDirect

Chemometrics and Intelligent Laboratory Systems journal homepage: www.elsevier.com/locate/chemometrics

Xin Zhang, Shaohua Lu, Yi Liao, Zhuoyong Zhang



Department of Chemistry, Capital Normal University, Beijing 100048, China

ARTICLE INFO

Keywords: Terahertz time-domain spectroscopy (THz-TDS); Amino acid; Partial least-squares (PLS); Support vector machine (SVM); Multivariate curve resolution alternating least squares (MCR-ALS)

ABSTRACT

Terahertz (THz) spectroscopy offers features that differ from those of commonly used techniques such as infrared and Raman spectroscopy, and it has potential applications in many areas. In this work, terahertz time-domain spectroscopy (THz-TDS) was used for the qualitative and quantitative analysis of ternary mixtures of L-Glutamic acid, L-Glutamine, and L-Tyrosine, which have similar chemical structures and properties. The amino acid mixtures were prepared in a yellow foxtail millet matrix, instead of polyethylene (PE), to simulate a more natural situation. Partial least squares (PLS) and support vector machine (SVM) were compared for quantitative analysis. Four preprocessing methods, multiplicative scatter correction (MSC), Savitzky-Golay (S-G) smoothing, first derivative, and wavelet transform, were investigated using bootstrapped Latin partitions and an external test set. The SVM model with MSC preprocessing was stable and gave accurate predictions for the ternary amino acid mixtures in foxtail millet analyzed by THz-TDS. The THz absorption spectra and the corresponding concentration profiles of the three amino acid components were resolved from the mixtures by multivariate curve resolution alternating least squares (MCR-ALS). The results suggest that THz spectroscopy could be applied to the analysis of other nutritional compounds in different cereals in the future.

Corresponding author. E-mail addresses: [email protected], [email protected] (X. Zhang), [email protected] (Z. Zhang).

http://dx.doi.org/10.1016/j.chemolab.2017.03.001
Received 1 November 2016; Received in revised form 6 February 2017; Accepted 3 March 2017; Available online 7 March 2017
0169-7439/© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Terahertz radiation covers frequencies from 0.1 to 10 THz (1 THz = 10¹² Hz; 0.03 mm to 3 mm in wavelength) and lies between the microwave and infrared regions of the electromagnetic spectrum. THz spectroscopy provides useful information on intra-molecular and inter-molecular modes that arise from hydrogen-bond stretching, van der Waals forces, and torsional and collective vibrational modes [1–4]. Because THz radiation does not ionize or damage biomolecules [5], it has potential applications in nondestructive detection and screening in medicine, environmental science, agriculture, pharmacy, food, and other fields. THz spectroscopy has been used to investigate the physical and chemical properties of bio-specimens such as DNA [6], amino acids and proteins [7–9], crystalline samples [10], and pesticides [11,12], and even for the diagnosis of cervical carcinoma [13,14]. Given these advantages, the potential of the technique for food process monitoring and quality control is worth studying.

As the structural and functional units of protein molecules, amino acids give proteins their specific molecular structure and biochemical activity. Amino acids not only provide essential nutrients for human beings but can also be used to treat some diseases. Raman [15–17], far-infrared [18–20], mid-infrared, and near-infrared spectroscopic techniques have been reported for the quantitative or qualitative analysis of amino acids. Most spectroscopic methods are fast and non-destructive and require little or no sample preparation. However, traditional spectroscopic techniques sometimes still struggle to distinguish amino acids with very similar structures, so we propose combining THz spectroscopy with chemometrics to analyze ternary amino acid mixtures.

Foxtail millet (botanical name Setaria italica, formerly Panicum italicum L.) is one of the most important cereals and the second most widely planted species of millet in East Asia [21,22]. It supplies high-quality vegetable protein containing most amino acids at levels that meet or exceed the FAO/WHO standards, with the exception of lysine. Quite a number of amino acids have been investigated by THz spectroscopy in published work [7,23–26], in which the authors typically used polyethylene (PE) as the matrix because of its low absorption in the THz region. Under natural conditions, however, the compounds to be determined are normally part of complex mixtures. To assess the feasibility of applying THz-TDS to the detection of amino acids in food samples, L-Glutamic acid (Glu), L-Glutamine (Gln), and L-Tyrosine (Tyr) were studied using millet as the matrix. Glu is one of the main energy sources of the animal mucosa, providing energy for the animal body, and it is closely related to the growth of young animals. Gln can be given orally to reverse cachexia (muscle wasting) in patients with advanced cancer [27,28]. Oral glutamine supplementation can also reduce the side effects of systemic infections induced by chemotherapy [29]. Tyr has a phenol functionality and acts as a receiver of phosphate groups transferred by protein kinases, which is part of signal transduction processes and changes the activity of the target protein. All of these amino acids have the potential to be used as oral supplements in food additives or nutritional supplements [30,31].

In our previous work, published in Food Chemistry [32], naturally grown yellow foxtail millet was used as the matrix instead of PE to simulate a more complex situation closer to natural conditions. However, more work is needed to better understand the molecular interaction features observed by THz-TDS. THz-TDS spectra, especially the characteristic bands of compounds with active groups, are difficult to interpret when the environment or sample matrix is complicated, because spectral signals and peaks are related to both intra-molecular and inter-molecular interactions. More systematic and in-depth study is needed to explain the components of complex mixture systems.

The absorption intensity of THz spectra is proportional to the analyte concentration within the dynamic ranges normally used, so regression modeling can be applied. Chemometrics has been applied to the qualitative and quantitative analysis of the THz spectra of mixtures [4,12,32–36]. PLS and SVM are typical linear and nonlinear models, respectively, commonly used in spectroscopic analysis, and they were compared here for predicting concentrations in amino acid mixtures from THz spectra. Multivariate curve resolution alternating least squares (MCR-ALS) has been used successfully to resolve multicomponent data from unknown mixtures [37]. The spectral profiles recovered by MCR-ALS can be compared with reference spectra to identify the pure components in the analyzed samples. Preprocessing methods, including multiplicative scatter correction (MSC) [38,39], Savitzky-Golay (S-G) smoothing [40–42], first derivative, and wavelet transform [43], can improve the prediction accuracy of the calibration models. Model performance was evaluated by the root mean square error of prediction (RMSEP) and the correlation coefficient (R²).

In this work, three amino acids (glutamic acid, glutamine, and tyrosine) were mixed with yellow foxtail millet powder, pressed into pellets, and measured by THz-TDS absorption spectroscopy. The changes in the characteristic spectral bands of the analytes in the foxtail millet matrix were investigated and explained for this more complex combination. This paper demonstrates the feasibility of applying THz-TDS to the detection of amino acids in food samples and the potential of THz-TDS combined with chemometrics as a practical tool for the quality control of food products.

Fig. 1. Schematic of the experimental THz-TDS setup.

2. Experimental

2.1. Sample preparation

The polyethylene powder was purchased from Sigma-Aldrich Inc. L-Glutamic acid and L-Glutamine (purities > 98.5%) were purchased from Beijing Dingguo Biotechnology Co. Ltd. The yellow cereal samples were provided by the Zhangjiakou Import and Export Commodities Inspection Bureau, China. The foxtail millet matrix was ground to a particle size of 0.5–0.75 µm. All of the above materials were dried in an oven at 313 K for 24 h to remove moisture and then stored in a desiccator at room temperature.

Thirty-two samples were prepared for the quantitative analysis of the ternary amino acid mixtures using THz-TDS. The concentration levels of the components were assigned according to an orthogonal experimental design. For each sample, 120 mg of yellow foxtail millet powder was mixed with the designed weights of the amino acids (levels of 1.5 mg, 3.75 mg, 7.5 mg, and 11.25 mg), and foxtail millet powder was then added to bring the total weight to 150 mg. After weighing and mixing, the samples were ground to a suitable particle size with a mortar. The homogenized mixtures were then pressed in a 13 mm diameter stainless-steel die (Model GS15011, Specac Ltd, Orpington, Kent, UK) under a constant load of 4 t for 3 min. The pellet thicknesses were between 0.95 and 1.08 mm. For comparison, a series of pure amino acid samples was prepared under the same conditions in the PE matrix and in the foxtail millet matrix, with amino acid contents of 3.33%, 6.66%, 10.00%, 13.33%, 16.67%, and 20.00%. All samples were prepared in duplicate.
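As a quick check of the design levels, the stated analyte masses can be converted to weight percent of the 150 mg pellet. The short sketch below does only this arithmetic; the names `LEVEL_MASSES_MG`, `PELLET_MASS_MG`, and `weight_percent` are illustrative and are not taken from the original work.

```python
# Masses (mg) of one amino acid at each design level, as stated in Section 2.1.
LEVEL_MASSES_MG = [1.5, 3.75, 7.5, 11.25]
# Total pellet mass (mg) after topping up with foxtail millet powder.
PELLET_MASS_MG = 150.0


def weight_percent(analyte_mg, total_mg=PELLET_MASS_MG):
    """Return the mass fraction of the analyte in the pellet, in percent (w/w)."""
    if total_mg <= 0:
        raise ValueError("total mass must be positive")
    return 100.0 * analyte_mg / total_mg


# Weight percent of a single amino acid at each design level.
levels_percent = [weight_percent(m) for m in LEVEL_MASSES_MG]
print(levels_percent)
```

Under this reading, the four design levels correspond to 1%, 2.5%, 5%, and 7.5% (w/w) of a single amino acid in the pellet.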

2.2. Instrumentation

A schematic of the THz-TDS transmission platform, assembled by Daheng New Epoch Technology Inc., China, is shown in Fig. 1. Details of the THz-TDS system used in this work can be found in Ref. [32]. The system uses a femtosecond laser (Spectra-Physics, MaiTai). Two photoconductive antennas were used to generate and detect the THz pulses, respectively. The laser output was split into pump and probe beams. The pump beam passed through the sample and was collected by another ZnTe-based semiconductor antenna. The detected probe beam provided the reference pulse on the antenna. In the experiments, dry nitrogen was used as the reference. The humidity was kept below 5% during the measurements by a constant flow of N2 through the experimental setup. To reduce random measurement errors, each reference (nitrogen) and sample was measured three times at different locations by rotating the sample clockwise by 45°. Infrared (IR) spectra were measured from 4000 to 400 cm⁻¹ using KBr pellets with a Bruker TENSOR 27 FT-IR spectrometer.


2.3. Terahertz spectroscopy

2.3.1. Calculation of absorption coefficient

The extraction of optical parameters from time-domain data is based on the theory developed by Timothy and Duvillaret [44,45]. To obtain the transmission spectrum, the spectral information of both the reference and the sample is needed. The instrument records time-domain amplitude signals, Er(t) and Es(t), for the reference and the sample, respectively. The electric fields in the frequency domain are obtained by fast Fourier transform (FFT), and their ratio is given by

$$\frac{E_s(\omega)}{E_r(\omega)} = A(\omega)\, e^{i\varphi(\omega)} \tag{1}$$

where Es(ω) and Er(ω) are the sample and reference electric fields, and A(ω) and φ(ω) are the amplitude ratio and phase difference between the sample and reference signals, respectively. The refractive index n(ω) and absorption coefficient α(ω) of the sample can be calculated as follows:

$$n(\omega) = \frac{c\,\varphi(\omega)}{\omega d} + 1 \tag{2}$$

$$\alpha(\omega) = \frac{2\omega\kappa}{c} = \frac{2}{d} \ln\left\{\frac{4 n(\omega)}{A(\omega)\,[n(\omega) + 1]^2}\right\} \tag{3}$$

in which n(ω) and α(ω) describe the dispersion and absorption characteristics, ω is the angular frequency, c is the speed of light in vacuum, d is the thickness of the sample, and κ is the attenuation coefficient.

2.3.2. Data preprocessing methods

Multiplicative scatter correction (MSC) was first applied to correct offset and scaling effects in the spectral data [38,39]. MSC can remove physical effects such as particle size or surface blaze. In this work, the average spectrum was used as the reference for estimating the scatter. Each individual spectrum was shifted and rotated so that it fit the reference spectrum as closely as possible, which corrects differences in the baseline trend. Savitzky-Golay (S-G) smoothing is a polynomial digital filter that smooths spectra to remove noise and increase the signal-to-noise ratio (SNR) without significantly distorting the raw data [40–42]. S-G smoothing simplifies computation on the collected spectral dataset, and the processed data are normally used to build a robust prediction model. The wavelet transform (WT) was first used to analyze non-stationary, time-varying signals [43]. Its localization characteristics in the time-frequency plane depend on the scale parameter and the time translation parameter. Wavelet transforms form a time-frequency representation of continuous-time (analog) signals and provide efficient descriptions of signals with localized features in the time-frequency plane. The wavelet transform can also be used for data compression and denoising by eliminating wavelet coefficients that carry information about less localized frequency regions [46–48].

2.3.3. Regression modeling and evaluation

PLS is a widely used linear regression method, and details can be found elsewhere [49]. PLS is particularly suitable when there is multicollinearity among the collected spectral variables. It extracts orthogonal features from the spectral data and builds the correlation between these features and the latent variables [4]. More details can be found in Ref. [50]. SVM regression can efficiently fit a nonlinear model using a kernel function, which transforms the data into a higher-dimensional feature space. SVM with the RBF kernel was adopted in this work to examine the potential of nonlinear models for analyzing THz spectroscopic data [51]. The penalty weight C and the kernel parameter g were set to 32 and 0.0034, respectively, after optimization by grid search with cross validation. The root mean square error of prediction (RMSEP) and the correlation coefficient (R²) were used to evaluate the performance of each prediction model with the different pretreatments. A good model gives precise predictions, with a high R² (close to 1) and a low RMSEP (close to 0).

2.3.4. Multivariate curve resolution

Multivariate curve resolution (MCR) has been successfully applied to resolve mixture systems that are measured by different instruments and follow a bilinear model. When applied to spectra, MCR methods decompose the raw signals of the mixtures into pure concentration profiles and pure spectra. Constraints reflecting physical and chemical properties can be applied flexibly, and the resolved model should explain the maximum variance of the raw measurements. The spectra and concentrations resolved by MCR in the bilinear model are physically or chemically meaningful, and each component corresponds to a factor or source [52,53]. In this work, the quality of the MCR-ALS fit was evaluated by the percentage lack of fit (lof, Eq. (4)) and the percentage of explained variance (R², Eq. (5)):

$$\mathrm{lof}\,(\%) = 100 \times \sqrt{\frac{\sum_{ij} \left(d_{ij} - \hat{d}_{ij}\right)^2}{\sum_{ij} d_{ij}^2}} \tag{4}$$

$$R^2 = 100 \times \left[1 - \frac{\sum_{ij} e_{ij}^2}{\sum_{ij} d_{ij}^2}\right] \tag{5}$$

where d_ij is an element of the THz spectral data matrix of the samples, d̂_ij is the corresponding element calculated by ALS, and e_ij are the elements of the noise (residual) matrix. The lack-of-fit value measures the fit quality in the same units as the measured data and is comparable with relative error estimates of the experiment. In this work, reference spectra of Glu, Gln, and Tyr for evaluating the MCR model were obtained by measuring the pure amino acids by THz spectroscopy under experimental conditions similar to those of the samples. The agreement between a reference spectrum and a spectrum resolved by MCR-ALS was calculated using the correlation coefficient (r²) and the vector angle between them, Eqs. (6) and (7):

$$r^2 = \frac{x y^{T}}{\lVert x \rVert\, \lVert y \rVert} \tag{6}$$

$$\mathrm{angle} = \frac{180}{\pi} \times \arccos\left[\frac{x y^{T}}{\lVert x \rVert\, \lVert y \rVert}\right] \tag{7}$$

where x is the vector of a profile resolved by MCR-ALS and y is the vector of the reference spectrum measured in the laboratory.

2.3.5. Validation

Bootstrapped Latin partitions (BLPs) were developed to evaluate the accuracy and stability of calibration methods by combining cross validation with random sampling [54]. An unbiased and reliable evaluation is obtained by systematically building models with samples drawn from an arbitrary discrete distribution in a well-designed experiment. Five Latin partitions and one hundred bootstrap repetitions, chosen according to the literature and our experience, were used to measure the prediction accuracy of the models with different preprocessing. In each bootstrap repetition, all samples were randomly divided into five equal parts; each part was used exactly once as the test set, and the remaining four parts were used as the training set to build the calibration model. The predictions from all partitions were pooled. The performance of the preprocessing methods and the prediction results for different numbers of PLS latent variables were evaluated using this bootstrapped Latin partition cross validation. The Kennard-Stone method [55] was used to partition the data into calibration and validation subsets: 22 samples were selected as the training set and evaluated with bootstrapped Latin partitions, and the remaining 10 samples were used as the external test set. The RMSE and R² of the external test set were used to evaluate the reliability of the models. All model optimization and construction was performed in MATLAB (MATLAB 7.14.0.334, The MathWorks Inc.). The LibSVM toolbox [56] was used to implement SVM. SuperPLS was used for the bootstrapped Latin partition cross validation and for determining the number of PLS latent variables [57].

3. Results and discussion

3.1. THz spectra of the three amino acids

Fig. 2(A) shows the infrared (IR) spectra of L-Glutamic acid, L-Glutamine, and L-Tyrosine. The IR spectra of L-Glutamic acid and L-Glutamine show very similar profiles and absorption peaks. The IR spectrum of L-Tyrosine shows lower transmittance from 2100 to 3300 cm⁻¹ than the spectra of the other two amino acids. It is difficult to build a quantitative model for all amino acids in the mixture from IR spectra, because two of the selected amino acids have very similar, collinear IR spectra. Fig. 2(B) displays the THz spectra of L-Glutamic acid, L-Glutamine, and L-Tyrosine in the polyethylene matrix at different concentrations. Higher concentrations in the mixture sample give higher absorption. Compared with the infrared spectra, the THz spectra show many more differences between L-Glutamic acid, L-Glutamine, and L-Tyrosine. Because of the individual characteristics of their chemical groups, each constituent has unique vibrational modes in the THz region. As shown in Fig. 2(B), the THz spectrum of glutamine in the polyethylene matrix displays three characteristic bands, located at 1.71 THz, 2.24 THz, and 2.52 THz. The characteristic bands of glutamic acid are at 1.23 THz, 2.03 THz, and 2.64 THz. These results are consistent with those reported in Ref. [32]. A new observation in this work is that the characteristic THz bands of tyrosine are located at 0.97 THz, 2.07 THz, and 2.67 THz. Although Glu and Gln have more similar chemical structures, Glu and Tyr display some common characteristic bands.

Fig. 2(C) shows the THz spectra of L-Glutamic acid, L-Glutamine, and L-Tyrosine in the foxtail millet matrix at different concentrations. With foxtail millet as the matrix, the spectra in the range of 0.3 to 1.8 THz displayed useful information and were consistent with those in the polyethylene matrix. The signal intensity of the samples in foxtail millet is stronger because of the sloping background of the grain. The spectra between 1.8 THz and 2.5 THz lack features, owing to scattering induced by the cereal particles. Fig. 3(A) shows the absorption spectrum of the yellow foxtail millet powder; no obvious characteristic bands can be found, only a sloping background. Fig. 3(B) gives the absorption spectra of mixtures of L-Glutamic acid, L-Glutamine, and L-Tyrosine in PE. Fig. 3(C) shows the spectra of pellets pressed with predetermined contents of the three amino acids in yellow foxtail millet. In Fig. 3(B) and (C), most of the characteristic bands of the three pure amino acids can still be found when the amino acids are mixed together. The different positions and absorption intensities of their characteristic bands make it possible to discriminate them using chemometric models. As shown in Fig. 3(C), the sensitivity of THz spectroscopy to the amino acids in the foxtail millet matrix is slightly lower than in polyethylene (Fig. 3(B)), owing to the influence of other nutritional components in the cereal, such as saccharides, cellulose, and proteins. The compounds in the foxtail millet matrix cause the sloping background shown in Fig. 3(A), which covers some signature information of the target amino acids.

Fig. 2. (A) The infrared spectra of L-Glutamic acid, L-Glutamine, and L-Tyrosine in the matrix of KBr. (B) THz spectra of L-Tyrosine (a), L-Glutamic acid (b), and L-Glutamine (c) at different concentrations in the matrix of polyethylene. (C) THz spectra of L-Tyrosine (a), L-Glutamic acid (b), and L-Glutamine (c) at different concentrations in the matrix of foxtail millet.
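The absorption spectra discussed in this section follow from the measured amplitude ratio and phase difference through Eqs. (2) and (3) of Section 2.3.1. The following is a minimal single-frequency sketch of those two formulas using only the Python standard library. The function names and the numerical values are hypothetical and chosen for illustration; the original processing was carried out in MATLAB, and this is not the authors' code.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def refractive_index(phase_rad, freq_hz, thickness_m):
    """Eq. (2): n = c*phi/(omega*d) + 1, where phi is the phase difference in radians."""
    omega = 2.0 * math.pi * freq_hz
    return C_LIGHT * phase_rad / (omega * thickness_m) + 1.0


def absorption_coefficient(amp_ratio, n, thickness_m):
    """Eq. (3): alpha = (2/d) * ln(4n / (A * (n + 1)^2)); result in 1/m for d in metres."""
    return (2.0 / thickness_m) * math.log(4.0 * n / (amp_ratio * (n + 1.0) ** 2))


# Illustrative round trip: build A and phi from assumed n and alpha, then recover them.
freq = 1.0e12        # 1 THz (assumed)
d = 1.0e-3           # 1 mm pellet thickness (assumed, close to the stated 0.95-1.08 mm)
n_true = 1.5         # assumed refractive index
alpha_true = 2000.0  # assumed absorption coefficient, 1/m (20 cm^-1)

omega = 2.0 * math.pi * freq
phase = (n_true - 1.0) * omega * d / C_LIGHT
amp = 4.0 * n_true / (n_true + 1.0) ** 2 * math.exp(-alpha_true * d / 2.0)

n_est = refractive_index(phase, freq, d)
alpha_est = absorption_coefficient(amp, n_est, d)
```

In practice A(ω) and φ(ω) would come from the FFT of the sample and reference pulses, and the phase would need unwrapping before Eq. (2) is applied; both steps are omitted here.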

Fig. 3. (A) THz spectra of foxtail millet powder. (B) THz spectra of ternary amino acid mixtures at different concentrations in the matrix of polyethylene. (C) THz spectra of ternary amino acid mixtures at different concentrations in the matrix of foxtail millet.

3.2. Quantitative analysis

Before building the calibration models, MSC, S-G smoothing, first derivative, and wavelet transform were applied to improve the signal-to-noise ratio and enhance the stability of the prediction models. The performance of the preprocessing methods was evaluated from the pooled predictions of the bootstrapped Latin partitions with the PLS model. The average prediction results, expressed as RMSEP and correlation coefficient (R²), were calculated across the 100 bootstraps to provide 95% confidence intervals (see Table 1). Partial least squares (PLS) and support vector machine (SVM) models with different pretreatment methods were built to quantify the amino acids in the yellow foxtail millet samples, using the absorption coefficients in the range of 0.3 to 1.8 THz. Table 1 compares the RMSE (RMSECV and RMSEP) and R² obtained by PLS and SVM with the different preprocessing methods for the cross validation and external test datasets. The PLS and SVM models give different RMSEP and R² values when different preprocessing methods are applied. For the PLS model, lower RMSEP and higher R² were obtained with preprocessing than without. Denoising and MSC improved the SVM model. However, when the wavelet transform was applied, the prediction accuracy of SVM decreased sharply, because the parameters C and g were no longer suitable for the processed data. The best predictions for the different pretreatments on each amino acid are shown in bold in Table 1. The SVM model with MSC provided the most stable predictions for both the validation set and the external test set. Fig. 4 displays the correlation between the true concentrations and the concentrations predicted by the PLS model for the three amino acids, demonstrating that the PLS model achieves satisfactory analytical precision.

Table 1. Comparison of the results obtained by PLS and SVM for the cross validation and external test datasets with and without preprocessing.

Cross validation

Model  Preprocessing      Gln RMSECV (%)  Gln R²   Glu RMSECV (%)  Glu R²   Tyr RMSECV (%)  Tyr R²
PLS    None               1.0471          0.8152   0.5673          0.9477   0.4977          0.9584
PLS    MSC                1.0193          0.8232   0.6019          0.9405   0.4469          0.9669
PLS    SG smoothing       1.0426          0.8167   0.5986          0.9415   0.4774          0.9618
PLS    Wavelet transform  1.0427          0.8157   0.5785          0.9456   0.4645          0.9639
SVM    None               0.9250          0.8287   1.6355          0.9093   1.4439          0.9003
SVM    MSC                0.7303          0.8618   0.6592          0.9087   0.4216          0.9732
SVM    SG smoothing       0.8648          0.8591   1.1890          0.9244   0.5320          0.9472
SVM    Wavelet transform  3.5645          0.7253   0.9025          0.9365   2.1454          0.7533

External test

Model  Preprocessing      Gln RMSEP (%)   Gln R²   Glu RMSEP (%)   Glu R²   Tyr RMSEP (%)   Tyr R²
PLS    None               0.8274          0.8573   0.6532          0.9539   0.5273          0.9798
PLS    MSC                0.9792          0.8669   0.5174          0.9697   0.3076          0.9790
PLS    SG smoothing       0.8699          0.8507   0.5440          0.9698   0.4487          0.9837
PLS    Wavelet transform  0.7656          0.8725   0.5729          0.9686   0.4594          0.9810
SVM    None               0.7392          0.8635   0.5081          0.9637   0.7084          0.9289
SVM    MSC                0.9018          0.8295   0.5053          0.9654   0.3726          0.9690
SVM    SG smoothing       0.7463          0.8609   0.5137          0.9629   0.6737          0.9357
SVM    Wavelet transform  0.9110          0.7928   0.9987          0.8599   1.3714          0.7335
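To make the preprocessing and the evaluation metrics used in this comparison concrete, the following is a minimal sketch of multiplicative scatter correction with the mean spectrum as reference (as described in Section 2.3.2), together with RMSE and R². It uses only the Python standard library and plain lists. The function names are illustrative, and R² is computed here as the squared Pearson correlation, which is an assumption because the paper does not spell out its definition. The original work used MATLAB with LibSVM and SuperPLS, so this is not the authors' implementation.

```python
import math


def mean_spectrum(spectra):
    """Average spectrum over all samples (each spectrum is a list of equal length)."""
    n = len(spectra)
    return [sum(col) / n for col in zip(*spectra)]


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum x is regressed on the reference (x ~ offset + slope * ref) by
    ordinary least squares and corrected as (x - offset) / slope.
    Assumes the reference is not constant and every fitted slope is non-zero.
    """
    ref = reference if reference is not None else mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for x in spectra:
        x_mean = sum(x) / len(x)
        slope = sum((r - ref_mean) * (v - x_mean) for r, v in zip(ref, x)) / ref_ss
        offset = x_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in x])
    return corrected


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted concentrations."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Squared Pearson correlation between reference and predicted values (assumed definition)."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


# Synthetic example: three copies of one band shape with different offset and scale.
base = [1.0, 2.0, 4.0, 3.0]
raw = [base, [2.0 * v + 0.5 for v in base], [0.5 * v - 0.2 for v in base]]
corrected = msc(raw)
reference = mean_spectrum(raw)
```

In this synthetic case the three spectra differ only by an offset and a scale factor, so after correction they all coincide with the mean spectrum. Real spectra also carry concentration information, which MSC leaves in the residual around the fitted line.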

3.3. MCR results for the ternary amino acid mixture samples

Based on the results of singular value decomposition (SVD), three components were used for MCR. Models with four and five components were also tested, but they did not explain much more information and risked overfitting. Fig. 5 shows the pure spectrum of each component resolved by MCR-ALS from the data of the samples in the foxtail millet matrix. ALS was initialized with the purest THz spectra in the raw data, and a non-negativity constraint was applied to both the spectral and concentration profiles during the iterations. The MCR results explained the raw data well: the fitting error (lack of fit, lof) was 2.468% and the explained variance (R²) at the optimum was 99.94%. The agreement between the reference spectra of the pure analytes measured by THz spectroscopy and the spectra of the components resolved by MCR was evaluated by the correlation coefficient (r²) and the angle, which were 0.9993 and 2.2078° for glutamine, 0.9997 and 1.3454° for glutamic acid, and 0.9994 and 2.0564° for tyrosine. These results show that the pure spectra obtained by MCR correspond to glutamine, glutamic acid, and tyrosine (Fig. 5). MCR-ALS also provides the relative concentrations of each component; the correlation coefficients between the resolved and true relative concentrations were 0.8036, 0.8921, and 0.9542 for Gln, Glu, and Tyr, respectively. Although these results are not as good as those of SVM, MCR-ALS provides concentration profiles for each component, which is helpful in many cases. The results provide evidence that the ternary mixture system has some nonlinear features under THz-TDS analysis.
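The fit and agreement measures quoted above (lof, explained variance, r², and angle; Eqs. (4)–(7) of Section 2.3.4) can be sketched in a few lines of standard-library Python. The function names are illustrative. The lack of fit is written with the square root that is usual for MCR-ALS, which is an assumption about the extracted Eq. (4); it appears consistent with the reported values, since 100 × sqrt(1 − 0.9994) ≈ 2.45%, close to the reported lof of 2.468% given the rounding of R².

```python
import math


def _sum_squares(matrix):
    """Sum of squared elements of a matrix given as a list of rows."""
    return sum(v * v for row in matrix for v in row)


def _residual_sum_squares(data, fitted):
    """Sum of squared differences between the data matrix and the ALS reconstruction."""
    return sum((d - f) ** 2 for rd, rf in zip(data, fitted) for d, f in zip(rd, rf))


def lack_of_fit(data, fitted):
    """Eq. (4): lof (%) = 100 * sqrt(sum e_ij^2 / sum d_ij^2)."""
    return 100.0 * math.sqrt(_residual_sum_squares(data, fitted) / _sum_squares(data))


def explained_variance(data, fitted):
    """Eq. (5): R^2 (%) = 100 * (1 - sum e_ij^2 / sum d_ij^2)."""
    return 100.0 * (1.0 - _residual_sum_squares(data, fitted) / _sum_squares(data))


def spectral_correlation(x, y):
    """Eq. (6): x.y / (|x| |y|) between a resolved profile x and a reference spectrum y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def spectral_angle_deg(x, y):
    """Eq. (7): angle in degrees; the cosine is clamped to [-1, 1] against rounding."""
    cosine = max(-1.0, min(1.0, spectral_correlation(x, y)))
    return math.degrees(math.acos(cosine))


# Tiny hypothetical example (values are illustrative, not measured data).
data = [[3.0, 4.0]]
fitted = [[3.0, 3.0]]
lof = lack_of_fit(data, fitted)             # 100 * sqrt(1 / 25)
r2_pct = explained_variance(data, fitted)   # 100 * (1 - 1 / 25)
angle = spectral_angle_deg([1.0, 0.0], [1.0, 1.0])
```

Because Eq. (6) is a cosine similarity, identical spectral shapes give a value of 1 and an angle of 0° regardless of scaling, which suits MCR results whose intensity scale is arbitrary.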

Fig. 4. Correlation between the content predicted by the SVM model with MSC preprocessing and the actual content of (A) glutamine, (B) glutamic acid, and (C) tyrosine in foxtail millet. (The error bars represent the 95% confidence intervals over 100 bootstrapped Latin partitions.)



Fig. 5. Comparison of the reference spectra of glutamine (A), glutamic acid (B), and tyrosine (C) measured by THz-TDS (dotted lines) with their pure spectra resolved by MCR-ALS (solid lines) from the THz data of the samples.
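The agreement between a reference spectrum and an MCR-resolved spectrum can also be summarized numerically. The sketch below uses the Pearson correlation coefficient, which is insensitive to the intensity scaling that resolved spectra typically carry. The choice of metric, the function name `pearson`, and the toy values are assumptions for illustration; they are not taken from the measurements in this work.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length spectra."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered cross-products and centered squares
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical): a reference spectrum, a resolved spectrum with the
# same shape but a different scale, and a spectrum whose peak sits elsewhere.
reference = [0.10, 0.35, 0.80, 0.45, 0.20]
resolved = [2.0 * v for v in reference]
shifted = [0.80, 0.45, 0.20, 0.10, 0.35]

r_same = pearson(reference, resolved)     # same band shape
r_shifted = pearson(reference, shifted)   # mismatched band position
```

A value of `r_same` close to 1 indicates matching band shapes regardless of scale, whereas a mismatched peak position lowers the coefficient.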

4. Conclusions

The proposed THz-TDS method successfully quantified ternary mixtures of amino acids with very similar chemical structures in yellow foxtail millet powder samples. The THz absorption spectra of the glutamine, glutamic acid, and tyrosine references showed distinct characteristic absorption bands between 0.3 and 1.8 THz, and these bands could still be recognized by their peak positions and absorption strengths when the amino acids were mixed with yellow foxtail millet powder. Among the data preprocessing methods and regression models compared, SVM with MSC preprocessing gave the best model for quantitative analysis of the ternary amino acid mixtures in foxtail millet by THz-TDS. MCR-ALS resolved the pure spectra and concentration profiles of the amino acids in the yellow foxtail millet matrix. The resolved THz spectral profiles of the three amino acids agree well with the experimental spectra of the pure amino acids, which indicates that MCR-ALS has potential for resolving THz spectra. These results show that THz-TDS can be used for the qualitative and quantitative analysis of amino acids in yellow foxtail millet, and that the proposed method could be extended to other nutritional components in different cereals.
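As an illustration of the MSC preprocessing step named above, the following is a minimal sketch and not the implementation used in this work. It assumes the mean spectrum of the data set serves as the reference, fits each spectrum to that reference by least squares, and then removes the fitted offset and slope. The function name `msc` and the toy spectra are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    spectra: list of equal-length lists, one absorbance spectrum per sample.
    Returns the corrected spectra as a new list of lists.
    """
    n_points = len(spectra[0])
    # Reference spectrum (assumption): mean over all samples at each point.
    ref = [mean([s[j] for s in spectra]) for j in range(n_points)]
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)

    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of s = a + b * ref
        b = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        a = s_mean - b * ref_mean
        # Remove the additive offset a and the multiplicative factor b
        corrected.append([(x - a) / b for x in s])
    return corrected

# Toy data (hypothetical): one band shape distorted by different
# offsets and scale factors, mimicking scatter effects.
base = [0.1, 0.4, 0.9, 0.5, 0.2]
raw = [
    list(base),
    [0.3 + 1.5 * x for x in base],
    [-0.1 + 0.8 * x for x in base],
]
corrected = msc(raw)
```

In this toy case every spectrum is an exact offset-and-scale copy of the same band shape, so the corrected spectra coincide with the mean spectrum. Real spectra would retain their chemical differences after correction, and the corrected matrix would then be passed to the regression model.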

Acknowledgements

This work was supported by the National Instrumentation Program (2012YQ140005) and the National Natural Science Foundation of China (21275101).



