Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra



Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 183 (2017) 75–83

Contents lists available at ScienceDirect

Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy journal homepage: www.elsevier.com/locate/saa

Hao Zhan 1, Jing Fang 1, Liying Tang, Hongjun Yang, Hua Li, Zhuju Wang, Bin Yang, Hongwei Wu ⁎, Meihong Fu ⁎

Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Dong Nei Nan Xiao Jie 16, Beijing 100700, China

Article info

Article history: Received 11 August 2016; received in revised form 16 April 2017; accepted 18 April 2017; available online 20 April 2017.

Keywords: NIR; HPLC; PLS; Radix Paeoniae Rubra; Quality assessment

Abstract

Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were used to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky–Golay filter, and Norris derivative filter were applied to remove systematic errors. The performance of the model was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that, compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both fitting and prediction. Furthermore, the areas of origin of the Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra. © 2017 Published by Elsevier B.V.

⁎ Corresponding authors. E-mail addresses: [email protected] (H. Wu), [email protected] (M. Fu).
1 These authors contributed equally to this work.

http://dx.doi.org/10.1016/j.saa.2017.04.034 1386-1425/© 2017 Published by Elsevier B.V.

1. Introduction

Radix Paeoniae Rubra (RPR), known as Chi Shao in Chinese, is derived from the root of Paeonia lactiflora Pall or Paeonia veitchii Lynch (Family Ranunculaceae) [1]. In traditional Chinese medicine (TCM), RPR is widely used to reduce fever, cool blood, eliminate blood stasis, activate blood circulation, and relieve pain [1,2]. Modern phytochemical and pharmacological studies have shown that the major bioactive components of RPR are monoterpene glycosides, triterpenoid saponins, tannins, polysaccharides, and aromatic acids [3–7]. Various biological activities of the compounds or extracts from RPR such as antioxidant [8,9], anticonvulsant [10], antithrombotic, free radical scavenging [11], and antihyperlipidemic [12] activities have been reported. Several analytical techniques such as high-performance liquid chromatography coupled with photodiode array detector (HPLC-PDA) [13–15], gas chromatography–mass spectrometry [16], and liquid chromatography coupled with mass spectrometry [17,18] have been reported for the qualitative or quantitative analysis of the chemical constituents in RPR. Chromatographic methods have the advantages of accuracy, specificity, and reproducibility. However, all of them are destructive, consume solvents, and require time-consuming extraction and measurement steps [19,20]. Therefore, it is necessary to develop a fast and effective method for the rapid quality assessment of RPR.

Near-infrared (NIR) reflectance spectroscopy is a rapid, accurate, and nondestructive technique [21]; it has proved to be a powerful analytical tool for quality analysis in the agricultural [22,23], food [24,25], petrochemical [26], and pharmaceutical industries [27]. In these studies, NIR spectral calibrations are often made with classical multivariate methods, for example, partial least squares (PLS) and principal component regression (PCR). Many spectral pretreatment methods have been developed to reduce the effects of variations in the spectral data that are unrelated to the chemical variations in the samples [19,28]. Therefore, the choice of regression and pretreatment methods is very important for generating a stable model with good interpretability and high prediction ability. Coupled with proper multivariate analysis, NIR can be used to replace time-consuming chemical methods [29].

In this study, NIR was used as a rapid and nondestructive analytical tool to determine the contents of gallic acid, catechin, albiflorin, and paeoniflorin in RPR. These four compounds are the main bioactive components in RPR; paeoniflorin is the only quality-evaluation marker in the Chinese Pharmacopoeia. To the best of our knowledge, few studies have been reported for the simultaneous determination of these four compounds by NIR. Although


H. Zhan et al. / Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 183 (2017) 75–83

the quality control of RPR with NIR has been studied, the quality control markers were different and the number of samples was small [30]. In this study, more samples (101) from five different typical cultivated areas were collected to develop NIR models. Three regression methods, PLS, PCR, and stepwise multivariate linear regression (SMLR), were compared for NIR quantitative analysis [31]. Various pretreatment methods such as derivatives (1st and 2nd), multiplicative scatter correction (MSC), standard normal variate (SNV), Savitzky–Golay filter, and Norris derivative filter were used to obtain optimized models for gallic acid, catechin, albiflorin, and paeoniflorin, respectively.

RPR is widely distributed in China, and the prices of RPR obtained from different areas vary in the market [32]. There are few reports that use NIR to compare the quality of RPR obtained from different areas. Therefore, this study also investigated the feasibility of using NIR spectroscopy combined with principal component analysis (PCA) to classify the areas of origin of RPR samples.

2. Materials and methods

2.1. Samples and reagents

One hundred and one batches of RPR samples originating from different areas in China were collected for this study. The origin information of all the samples is summarized in Table 1. Prior to the HPLC and NIR analysis, the samples were crushed into powder using an FW100 high-speed grinder and passed through a 100-mesh sieve. Paeoniflorin, catechin, and gallic acid were purchased from the National Institute for Food and Drug Control (Beijing, China). Albiflorin was obtained from Nanjing Zelang Pharmaceutical Technology (Jiangsu, China). The purity of the compounds was assessed to be >98% by reverse-phase HPLC. The chemical structures of the compounds are shown in Fig. 1. Methanol was of HPLC grade and obtained from Merck (Darmstadt, Germany). Distilled water was further purified using a Milli-Q system (Millipore, USA). Phosphoric acid was purchased from Chemical Company of Beijing (China), and the other chemicals were of analytical grade.

2.2. Sample preparation and standard preparation for HPLC measurements

The powder sample was accurately weighed (0.5 g) and added to a flat-bottomed flask containing 25 mL of a water/MeOH mixture (30:70, v/v). The mixture was weighed and extracted by ultrasonication for 1 h. After the extract had cooled, the lost weight was made up with methanol. The extracts were then filtered through a 0.45-μm nylon filter directly into HPLC vials for immediate analysis.

A stock solution of the mixture of four reference compounds was prepared by dissolving accurately weighed amounts of the standards with methanol in a 10 mL volumetric flask and making up the volume with methanol. The concentrations of gallic acid, catechin, albiflorin, and paeoniflorin were 0.472, 0.604, 0.598, and 1.604 mg/mL, respectively. The stock solution was further diluted to a series of concentrations. Calibration curves were prepared for at least six different concentration levels.

Table 1 The origin information of Radix Paeoniae Rubra for all samples.

Original province    Number
Sichuan              41
Neimeng              20
Jilin                10
Heilongjiang         10
Guangxi              20

2.3. HPLC-PDA

The HPLC analysis was performed using an LC-20AT system coupled with a PDA detector (Shimadzu, Japan). Separation was achieved using a Diamonsil C18 (2) column (5 μm, 250 × 4.6 mm, Dikma Technologies, China), and the temperature was maintained at 30 °C. The mobile phase consisted of 0.1% phosphoric acid aqueous solution (A) and methanol (B) at a flow rate of 1.0 mL/min. Analysis was performed using the following gradient elution: 0 min, 3% B; 10 min, 3% B; 10.01 min, 20% B; 20 min, 20% B; 20.01 min, 25% B; 30 min, 25% B; 30.01 min, 30% B; 40 min, 30% B; 40.01 min, 50% B; and 60 min, 50% B. The system was re-equilibrated for 10 min with 3% B. The detection wavelength was 230 nm for catechin, paeoniflorin, and albiflorin and 275 nm for gallic acid. The injection volume was 10 μL. The content of each compound was calculated by comparing the peak area with the standard curve.

Method validation was conducted under the abovementioned conditions. The validation included calibration curves, precision, stability, repeatability, and accuracy. The intra- and interday precisions were determined by analyzing the standard solution containing the four marker compounds, with three repetitions daily over three consecutive days. The stability was tested with one prepared sample kept at room temperature and analyzed at 0 h, 2 h, 4 h, 8 h, 12 h, and 24 h. To confirm the repeatability, six different working solutions prepared from the same sample were analyzed. The RSD was taken as the measure of precision, stability, and repeatability. The accuracy of the method was estimated from recovery experiments. The recovery was determined by spiking a selected sample. First, the contents of the four analytes in the sample were calculated from the corresponding calibration curves; six sample aliquots were then spiked with identical amounts of the reference compound mixture. The fortified samples were extracted and analyzed as described above. The average recoveries were estimated using the formula: recovery (%) = (amount found − original amount) / amount spiked × 100%.

2.4. NIR spectra collection and software

The spectra of the prepared powder samples were collected using an integrating sphere diffuse reflectance NIR spectrometer (Antaris Fourier Transform-NIR spectrometer, Thermo Scientific, USA) in the range 10,000–4000 cm−1. With a built-in background as the reference, each spectrum was acquired with 64 scans at a resolution of 16 cm−1. To reduce the measuring error, each sample was measured twice, and the average of the two spectra was used for further analysis. The temperature was maintained at ~25 °C, and the humidity was maintained at a steady level in the laboratory. Spectral preprocessing, construction of the regression models with the different algorithms, and classification of the sample spectra were all carried out using the TQ Analyst software package (Thermo Scientific, USA).

2.5. Multivariate data analysis

The TQ Analyst software provides a broad selection of quantitative analyses. In this study, three regression methods (PLS, PCR, and SMLR) for calculating the concentrations of components were compared based on the raw NIR data. PLS, PCR, and SMLR are all linear regression methods widely used in NIR agrofood and TCM applications. In principle, PLS and PCR are similar. However, the PCR algorithm frequently requires one or two more factors than the PLS algorithm to describe the variation in the spectral data; therefore, it may take longer to calibrate a PCR method. The PLS method is capable of quantifying sample components when the correlation between concentration and absorbance is very complex, for example, when chemical interactions between the components shift or broaden the peaks in the mixture spectrum. SMLR, in contrast, generally works well when there is little or no overlap among component peaks and the components of interest absorb linearly with concentration.
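To make the PLS calibration step concrete, the following is a minimal sketch of single-response PLS (PLS1) fitted with the NIPALS algorithm in NumPy. It is not the TQ Analyst implementation used in this study; the function names `pls1_fit` and `pls1_predict` and the synthetic, noise-free three-component data are assumptions made only for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Fit a PLS1 model (one response) with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean           # mean-centre spectra and contents
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk                         # weight: covariance of X with y
        w /= np.linalg.norm(w)
        t = Xk @ w                            # scores of this factor
        tt = t @ t
        p = Xk.T @ t / tt                     # X loadings
        c = yk @ t / tt                       # y loading
        Xk = Xk - np.outer(t, p)              # deflate X and y
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # regression vector in spectral space
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

# Synthetic example (hypothetical data): 3 components, 50 spectral variables.
rng = np.random.default_rng(0)
pure = rng.random((3, 50))                    # made-up pure-component spectra
conc = rng.random((40, 3))                    # made-up contents
X, y = conc @ pure, conc[:, 0]
model = pls1_fit(X[:30], y[:30], n_factors=3) # calibration set
pred = pls1_predict(model, X[30:])            # validation set
```

With real spectra the number of factors would be chosen by cross-validation rather than fixed in advance, as described for the RMSECV later in the paper.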


Fig. 1. Chemical structures of gallic acid (a), catechin (b), albiflorin (c), and paeoniflorin (d).
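Two of the scatter-correction pretreatments compared in this study, SNV and MSC, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names, the use of the mean spectrum as the MSC reference, and the synthetic spectra are assumptions and do not describe the TQ Analyst implementation.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # model: s = intercept + slope * ref
        corrected[i] = (s - intercept) / slope
    return corrected

# Hypothetical spectra: one shape distorted by sample-specific offset and gain,
# mimicking additive and multiplicative scatter effects.
base = np.sin(np.linspace(0.0, 3.0, 200)) + 2.0
gains = np.array([0.8, 1.0, 1.3])[:, None]
offsets = np.array([0.1, 0.0, -0.2])[:, None]
raw = offsets + gains * base

snv_out = snv(raw)
msc_out = msc(raw)
```

In this idealized case both transforms map the three distorted spectra onto a single curve; with real powder spectra the correction is only approximate.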

The best regression method was selected according to the correlation coefficient (r), root mean square error of prediction (RMSEP), root mean square error of calibration (RMSEC), and root mean square error of cross-validation (RMSECV). Based on the best regression method, several spectral preprocessing methods such as derivatives (1st and 2nd), smoothing (Savitzky–Golay and Norris derivative filters), MSC, SNV, and combinations of these were applied to obtain the optimal model. Among the pretreatments, SNV is a mathematical transformation of the log spectra that removes slope variation and corrects scattering effects. MSC is another important procedure for correcting light scattering caused by differences in particle size. Savitzky–Golay and Norris derivative filters were used to smooth the data, i.e., to increase the signal-to-noise ratio without significantly distorting the signal.

Using the TQ Analyst software, a classification study was carried out to investigate the possibility of discriminating the areas of origin of RPR samples with principal component analysis (PCA). PCA is an unsupervised multivariate method of data analysis for data display and pattern recognition [33]. This method makes no assumptions about the underlying statistical data distribution. The original dataset is converted into the same number of orthogonal factors as the original variables, and these are referred to as the principal components (PCs). Each sample has a score value on each PC, and thus a score scatter plot for the first two or three PCs (PC1, PC2, and PC3) is often used to represent the characteristics of the samples and provides a comparative display of the samples [34].

3. Results and discussion

3.1. HPLC-PDA

All four analytes were successfully separated using the developed HPLC method, as shown in Fig. 2. The presence of each of the four compounds in the samples was confirmed from the retention time and UV spectra. The equations, linear ranges, retention times, and detection wavelengths for the four compounds are summarized in Table 2. The precision, stability, repeatability, and recovery of the method are given in Table 3. All the calibration curves showed good linearity (r2 > 0.999) within the test ranges. The precision results showed that the RSDs of the overall intra- and interday variations were <5% for all four analytes. In addition, the validation studies showed that the assay has good reproducibility, with RSDs also <5%. The developed analytical method has good accuracy, with overall recoveries from 94.9% to 100.1% for the analytes concerned. Therefore, the results indicate that the HPLC method is sensitive, specific, accurate, and stable. The concentrations of gallic acid, catechin, albiflorin, and paeoniflorin in RPR were measured accurately by the method described above. The contents of gallic acid, catechin, albiflorin, and paeoniflorin are listed in Table 4. NIR spectroscopy has relatively poor sensitivity. In this study, except for a few special samples, the contents of the detected compounds in most of the samples are >0.1%, and the average values of the four detected compounds are all above 0.2%; the NIR technique is therefore suitable for calculating the contents of these compounds in RPR.

3.2. Spectral feature of NIR

Fig. 3 shows the overlaid raw NIR spectra of RPR originating from different regions. NIR spectroscopy records the overtone and combination bands of hydrogen-containing groups such as C–H, O–H, and N–H. The four analytes gallic acid, catechin, albiflorin, and paeoniflorin contain C–H and O–H groups. Accordingly, the absorption characteristics of the spectrum are as follows: 4005 cm−1 and 4323 cm−1 (the vibration of methyl and methylene C–H bonds, respectively), 4766 cm−1 (first overtone of O–H deformation and C–H stretching), 5175 cm−1 (O–H stretching and deformation vibration), 5759 cm−1 (first overtone of C–H stretching of methyl, methylene), and 6844 cm−1 (first overtone of O–H stretching) [21,35]. Fig. 3 shows that the curve is smooth and that no clear changes in absorbance were observed in the range 10,000–8000 cm−1, whereas absorbance information was abundant in the range 8000–4000 cm−1. Therefore, to save analysis time, suitable analysis bands of RPR were selected using the program TQ 9.0 in the range 8000–4000 cm−1.

3.3. Comparison of different regression methods

To develop the regression model for each compound, the 101 collected samples were randomly divided into two subsets according to the content of the compound. One subset was the calibration set, which was used to build the model, and the other was the validation set, which was used to test the robustness of the model. The ratio of calibration to validation spectra was about 3:1. The calibration set consisted of 75 spectra, whereas the validation set consisted of 26 spectra. The range of y-value (the content of the compound) in the calibration set covered the range in the validation


Fig. 2. HPLC chromatograms of Radix Paeoniae Rubra. A: The mixture of the four standards at 275 nm; B: The mixture of the four standards at 230 nm; C: The sample of Radix Paeoniae Rubra at 275 nm; D: The sample of Radix Paeoniae Rubra at 230 nm. Gallic acid (a), catechin (b), albiflorin (c), and paeoniflorin (d). Gallic acid (a) is detected at 275 nm; catechin (b), albiflorin (c), and paeoniflorin (d) are detected at 230 nm.

set; therefore, the distribution of the samples was appropriate in the calibration and validation sets.

The quality of the model was evaluated by r, RMSEC, and RMSEP. The optimal number of factors was determined from the RMSECV obtained by leave-one-out cross-validation. The RMSEC value was calculated using formula (1) [33], where A is the known (actual) concentration of a selected standard, C is the value computed by the model from the spectral data, M is the number of samples in the calibration set, and f is the number of factors used in the calibration model. A low RMSEC indicates that the regression model fits the calibration data well. To determine the closeness between the reference value (obtained through HPLC analysis) and the value determined by the calibration model, RMSEP was calculated using formula (2), where N is the number of samples in the validation set. For RMSECV, a leave-one-sample-out cross-validation was performed. The RMSECV was calculated using Eq. (3) [36], where n is the number of samples in the calibration set, $y_j$ is the reference measurement result for sample j, and $y_{/j}$ is the estimated result for sample j when the model is constructed with sample j removed.

$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{j=1}^{M} (A_j - C_j)^2}{M - f - 1}} \qquad (1)$$

Table 2 Calibration data for the detected compounds of Radix Paeoniae Rubra.

No  RT (min)  Compound name  Detection wavelength  Regression equation       R2      Linear range (μg/mL)
1   10.32     Gallic acid    275 nm                y = 3 × 10^7 x − 7975.3   0.9999  14.8–472.0
2   22.30     Catechin       230 nm                y = 7 × 10^6 x − 1176.2   0.9999  18.8–604.0
3   28.65     Albiflorin     230 nm                y = 6 × 10^6 x + 626.36   0.9998  18.7–598.0
4   34.27     Paeoniflorin   230 nm                y = 1 × 10^7 x − 40,212   0.9994  50.1–1604.0
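The upper limits of the linear ranges in Table 2 equal the stock concentrations given in Section 2.2 (for example, 0.472 mg/mL = 472.0 μg/mL), and the lower limits are close to 1/32 of them. The sketch below reproduces that arithmetic under the assumption, which the text does not state, of six two-fold dilution levels; the function name, dilution factor, and number of levels are illustrative only.

```python
# Stock concentrations (mg/mL) as stated in Section 2.2.
stock_mg_per_ml = {
    "gallic acid": 0.472,
    "catechin": 0.604,
    "albiflorin": 0.598,
    "paeoniflorin": 1.604,
}

def dilution_series(stock, factor=2.0, levels=6):
    """Concentrations in ug/mL for successive dilutions (assumed scheme)."""
    top = stock * 1000.0                      # mg/mL -> ug/mL
    return [top / factor ** k for k in range(levels)]

series = {name: dilution_series(c) for name, c in stock_mg_per_ml.items()}
for name, levels in series.items():
    print(name, [round(v, 2) for v in levels])
```

Under this assumption the lowest gallic acid level is 14.75 μg/mL, which is consistent with the 14.8 μg/mL lower limit reported in the table.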

Table 3 HPLC method validation of precision, stability, repeatability and accuracy.

No  Compound name  Precision (RSD %)                     Stability  Repeatability  Recovery (n = 6)
                   Intra-day (n = 3)  Inter-day (n = 9)  (RSD %)    (RSD %)        Mean (%)  RSD (%)
1   Gallic acid    0.17               0.23               1.79       1.44           98.99     2.25
2   Catechin       0.17               0.56               1.74       1.27           100.09    2.08
3   Albiflorin     1.02               2.31               2.45       1.68           96.36     2.09
4   Paeoniflorin   0.10               1.01               2.03       1.18           94.87     1.23
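The recovery and RSD figures in Table 3 follow from the recovery formula in Section 2.3. The short sketch below shows the calculation for six spiked aliquots; the amounts are hypothetical and are not data from this study, and the function names are illustrative.

```python
from statistics import mean, stdev

def recovery_percent(found, original, spiked):
    """Recovery (%) = (amount found - original amount) / amount spiked * 100."""
    return (found - original) / spiked * 100.0

def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0

# Hypothetical amounts (mg) for six spiked aliquots of one analyte.
original, spiked = 1.00, 1.00
found = [1.95, 1.97, 1.93, 1.96, 1.94, 1.95]

recoveries = [recovery_percent(f, original, spiked) for f in found]
print(round(mean(recoveries), 2), round(rsd_percent(recoveries), 2))
```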

Table 4 The contents of the four detected compounds in the Radix Paeoniae Rubra samples.

Compound name  Maximum (%)  Minimum (%)  Average value (%)
Gallic acid    1.002        0.043        0.256
Catechin       4.064        0.091        1.011
Albiflorin     1.191        0.045        0.533
Paeoniflorin   5.015        1.761        2.976

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{j=1}^{N} (A_j - C_j)^2}{N - 1}} \qquad (2)$$

$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{j=1}^{n} (y_{/j} - y_j)^2}{n}} \qquad (3)$$
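Equations (1)–(3) translate directly into code. The following is a minimal NumPy sketch; the ordinary least-squares model used to demonstrate the leave-one-out loop is a stand-in chosen for brevity (an assumption, not the PLS model of this study), and all function names are illustrative.

```python
import numpy as np

def rmsec(actual, calculated, n_factors):
    """Eq. (1): calibration error with M - f - 1 degrees of freedom."""
    a, c = np.asarray(actual, float), np.asarray(calculated, float)
    return np.sqrt(np.sum((a - c) ** 2) / (len(a) - n_factors - 1))

def rmsep(actual, calculated):
    """Eq. (2): prediction error on the validation set (N - 1 in the denominator)."""
    a, c = np.asarray(actual, float), np.asarray(calculated, float)
    return np.sqrt(np.sum((a - c) ** 2) / (len(a) - 1))

def rmsecv(X, y, fit, predict):
    """Eq. (3): refit without sample j, predict sample j, repeat for every j."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    n = len(y)
    errors = np.empty(n)
    for j in range(n):
        keep = np.arange(n) != j
        model = fit(X[keep], y[keep])
        errors[j] = predict(model, X[j:j + 1])[0] - y[j]
    return np.sqrt(np.mean(errors ** 2))

# Stand-in model for the demonstration: least squares with an intercept.
def ols_fit(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def ols_predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

X_demo = np.arange(6.0).reshape(-1, 1)        # hypothetical single predictor
y_demo = 2.0 * X_demo[:, 0] + 1.0             # exactly linear, so RMSECV is ~0
cv_error = rmsecv(X_demo, y_demo, ols_fit, ols_predict)
```

Any model exposing a fit and a predict function, such as a PLS model with a fixed number of factors, can be passed to `rmsecv` in the same way.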

Generally, good models should have a lower RMSEC, RMSECV, and RMSEP and a higher r, together with small differences between RMSECV and RMSEP. RMSEP is generally not higher than 1.2 times the RMSEC [37]. Three regression algorithms (SMLR, PCR, and PLS) were used to establish models for the quantitative analysis of gallic acid, catechin, albiflorin, and paeoniflorin in RPR based on the raw spectra. As shown in Table 5, the ranking of the three algorithms was similar for the four compounds. PLS, with a lower RMSEC, RMSECV, and RMSEP and a higher r, showed the best performance in both fitting and prediction for all four analytes. Therefore, the PLS method was selected to establish the calibration models and for further optimization.

3.4. PLS model

The raw data acquired from the NIR spectrometer contain background information and noise in addition to the sample information. To obtain optimized PLS calibration models, it is necessary to preprocess the spectral data before modeling. In this study, different spectral pretreatment methods, namely derivatives (1st and 2nd), smoothing (Savitzky–Golay and Norris derivative filters), MSC, and SNV, were selected and compared for each PLS calibration model. Derivatives can resolve overlapping peaks, thus improving the resolution and sensitivity. Specifically, the 1st derivative removes a constant additive baseline, and the 2nd derivative additionally removes a sloped additive baseline. As random noise of various origins is unavoidable in real spectra, smoothing can be used to reduce it and thus enhance the signal-to-noise ratio of the spectra. A decrease in random noise can decrease the error in the calibration model. MSC treatment was used to compensate for the variations in effective sample thickness caused by particle size and scattering. Similar to MSC, SNV was also used to eliminate the scattering effects in the spectrum.

The parameters of the PLS regression models with different spectral pretreatment methods for gallic acid, catechin, albiflorin, and paeoniflorin are shown in Table 6. The results show that the optimized models, with a higher r and lower values of RMSEC and RMSEP, may reliably predict the contents of the four analytes in RPR.
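The effect of derivative pretreatment on additive baselines can be illustrated with a small Savitzky–Golay filter written in NumPy. This is a sketch under stated assumptions: the window width, polynomial order, and synthetic signal are hypothetical (the paper does not report the filter settings), and the helper names are illustrative rather than part of TQ Analyst.

```python
import math
import numpy as np

def savgol_coeffs(window, polyorder, deriv=0, delta=1.0):
    """Savitzky-Golay weights for the centre point of an odd-length window."""
    half = window // 2
    x = np.arange(-half, half + 1) * delta
    A = np.vander(x, polyorder + 1, increasing=True)   # columns 1, x, x^2, ...
    # Row `deriv` of the pseudo-inverse yields the fitted coefficient a_deriv;
    # the deriv-th derivative at the window centre is deriv! * a_deriv.
    return np.linalg.pinv(A)[deriv] * math.factorial(deriv)

def savgol_apply(y, window, polyorder, deriv=0, delta=1.0):
    w = savgol_coeffs(window, polyorder, deriv, delta)
    # Correlation keeps the weights unflipped; 'valid' drops the window edges.
    return np.correlate(y, w, mode="valid")

# Hypothetical signal on a grid with spacing 0.5, with and without a baseline.
x = np.arange(20) * 0.5
signal = x ** 2
with_baseline = signal + 3.0 + 0.7 * x        # constant offset plus linear slope

d1_clean = savgol_apply(signal, 5, 2, deriv=1, delta=0.5)
d1_base = savgol_apply(with_baseline, 5, 2, deriv=1, delta=0.5)
d2_clean = savgol_apply(signal, 5, 2, deriv=2, delta=0.5)
d2_base = savgol_apply(with_baseline, 5, 2, deriv=2, delta=0.5)
```

Here the first derivative of the baseline-affected signal differs from the clean one only by the constant slope 0.7 (the offset is gone), while the second derivatives coincide, which is the behavior described above.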
The number of PLS factors for each model was selected by leave-one-out cross-validation, in which the RMSECV was used to determine the optimal model without overfitting or underfitting [38]. RMSEP and RMSECV were also close to RMSEC for the different models. When models performed similarly, the one with the higher cross-validation correlation coefficient was selected as the optimized model. Fig. 4 shows the correlation plots between the reference and PLS prediction values for the four compounds, and the prediction accuracy is

Fig. 3. Overlapping raw NIR spectra of Radix Paeoniae Rubra originating from different regions. The scanning range is 10,000–4000 cm−1. 4005 cm−1 and 4323 cm−1 (the vibration of methyl and methylene C–H bonds, respectively), 4766 cm−1 (first overtone of O–H deformation and C–H stretching), 5175 cm−1 (O–H stretching and deformation vibration), 5759 cm−1 (first overtone of C–H stretching of methyl, methylene), and 6844 cm−1 (first overtone of O–H stretching).


Table 5 Comparison of the PLS, PCR, and SMLR regression models based on raw NIR spectra.

Compound      Regression method  Calibration       Validation        Cross-validation
                                 RMSEC   r         RMSEP   r         RMSECV  r
Gallic acid   PLS                0.125   0.8381    0.0908  0.8957    0.156   0.7413
              PCR                0.126   0.8358    0.0956  0.8801    0.158   0.7341
              SMLR               0.166   0.6932    0.135   0.7368    0.175   0.6508
Catechin      PLS                0.5     0.9243    0.473   0.9012    0.631   0.8407
              PCR                0.82    0.7799    0.721   0.7322    1.01    0.6561
              SMLR               1.12    0.5144    1.02    0.2753    1.17    0.4489
Albiflorin    PLS                0.245   0.7719    0.202   0.7536    0.321   0.592
              PCR                0.277   0.6965    0.249   0.5875    0.327   0.5501
              SMLR               0.334   0.5002    0.312   0.3713    0.352   0.4139
Paeoniflorin  PLS                0.23    0.9655    0.256   0.9117    0.351   0.9207
              PCR                0.768   0.4939    0.621   0.3406    0.798   0.433
              SMLR               0.47    0.8464    0.505   0.571     0.55    0.7861

satisfactory. The circles in Fig. 4 indicate the samples of the calibration set, and the crosses indicate the samples of the validation set. The values of r, RMSEC, and RMSEP indicate the precision achieved in calibration and validation. The high r and low values of RMSEC and RMSEP indicate that the four compounds in RPR could be reliably predicted by the optimized PLS models. The optimized results are as follows:

Gallic acid: The 2nd derivative with MSC gave the best model. For the validation set, the correlation coefficient was 0.9163 and the RMSEP was 0.0907. For the calibration set, the correlation coefficient was 0.9065 and the RMSEC was 0.0971. The RMSECV was 0.13. Seven PLS factors were required in the best model.

Catechin: The optimal PLS model was obtained with the 2nd derivative and MSC. Here, the RMSEP was 0.404, and the validation correlation coefficient was 0.9231. The RMSEC was 0.444, and the calibration correlation coefficient was

Table 6 The results of PLS for NIR with different pretreatments.

Compound      Pretreatment       Calibration       Validation        RMSECV
                                 RMSEC   r         RMSEP   r
Gallic acid   None               0.0891  0.9217    0.115   0.8897    0.129
              SNV                0.0964  0.9077    0.0935  0.8934    0.136
              MSC                0.125   0.8381    0.0908  0.8957    0.156
              1st der + MSC      0.104   0.8919    0.0775  0.9344    0.135
              1st der + S-G      0.105   0.8905    0.0668  0.9499    0.136
              2nd der + MSC *    0.0971  0.9065    0.0907  0.9163    0.131
              2nd der + Norris   0.0887  0.9226    0.0732  0.9343    0.129
Catechin      None               0.5     0.9243    0.473   0.9012    0.631
              SNV                0.56    0.9039    0.434   0.9062    0.63
              MSC                0.549   0.9079    0.454   0.9       0.6
              1st der + MSC      0.53    0.9144    0.382   0.9252    0.529
              1st der + S-G      0.551   0.9073    0.456   0.8991    0.6
              2nd der + MSC *    0.444   0.9408    0.404   0.9231    0.521
              2nd der + Norris   0.496   0.9256    0.468   0.9097    0.505
Albiflorin    None               0.183   0.8797    0.123   0.9263    0.285
              SNV                0.244   0.7736    0.186   0.7999    0.33
              MSC                0.245   0.7719    0.202   0.7536    0.321
              1st der + MSC *    0.131   0.9406    0.142   0.9226    0.235
              1st der + S-G      0.165   0.9037    0.13    0.9242    0.245
              2nd der + MSC      0.158   0.9121    0.114   0.9405    0.234
              2nd der + Norris   0.194   0.8646    0.125   0.9264    0.273
Paeoniflorin  None               0.23    0.9655    0.256   0.9117    0.351
              SNV                0.253   0.9581    0.173   0.96      0.397
              MSC                0.259   0.9562    0.165   0.9573    0.41
              1st der + MSC      0.164   0.9826    0.22    0.9565    0.315
              1st der + S-G *    0.224   0.9674    0.215   0.9281    0.315
              2nd der + MSC      0.353   0.9167    0.358   0.8123    0.43
              2nd der + Norris   0.245   0.9608    0.231   0.9173    0.344

SNV, standard normal variate; MSC, multiplicative scattering correction; 1st der, first derivative; 2nd der, second derivative; S-G, Savitzky–Golay filter; Norris, Norris derivative filter. The best calibrations, highlighted in bold in the original, are marked here with an asterisk according to the optimized results reported in the text.

0.9408. The RMSECV was 0.521. Seven PLS factors were required in the best model.

Albiflorin: The 1st derivative with MSC afforded the best results. The validation correlation coefficient was 0.9226, and the RMSEP was 0.142. The correlation coefficient of the calibration set was 0.9406, and the RMSEC was 0.131. The RMSECV was 0.235. Seven PLS factors were used in the model.

Paeoniflorin: The best pretreatment was the 1st derivative with the Savitzky–Golay filter. The validation correlation coefficient was 0.9281, and the RMSEP was 0.215. The correlation coefficient of the calibration set was 0.9674, and the RMSEC was 0.224. The RMSECV was 0.315. Ten PLS factors were used in the model.

3.5. PCA model

A classification of all the samples originating from different areas was performed by PCA using the TQ Analyst software. The quality of the fitted model can be described by the R2 and Q2 values. R2 shows the variance explained by the model and indicates the quality of the fit. A large R2 (close to 1) is a necessary condition for a good model, but it is not sufficient. Q2 shows the variance predicted by the model, indicating its predictive ability. Generally, a large Q2 (Q2 > 0.4) indicates good predictivity. Score plots of PCA were used to visually classify the samples by their measured properties. The distribution of the samples on such a plot establishes a pattern that correlates with the general characteristics of the samples. The scores PC1, PC2, and PC3 are new variables summarizing the X-variables. The score PC1 (first component) explains the largest variation of the X space, followed by PC2. The plot shows the possible presence of outliers, groups, and similarities in the data. Initially, all the original measured data were analyzed using PCA to test the clustering of cultivated regions. The PCA score 3D plot (Fig. 5) shows an overview of all the samples in the data.
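As an illustration of how PCA scores and the variance explained by each PC are obtained, the following is a minimal sketch using the singular value decomposition in NumPy. The two synthetic sample groups are hypothetical and merely stand in for spectra from two regions; the function name is illustrative and the code is not the TQ Analyst procedure.

```python
import numpy as np

def pca_scores(X, n_components=3):
    """Return scores, loadings, and explained-variance ratios of mean-centred X."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)       # fraction of total variance per PC
    return scores, Vt[:n_components], explained[:n_components]

# Hypothetical data: two groups of 10 "spectra" that differ along one pattern.
rng = np.random.default_rng(1)
pattern = np.linspace(0.0, 1.0, 50)
group_a = rng.normal(0.0, 0.01, (10, 50))
group_b = pattern + rng.normal(0.0, 0.01, (10, 50))
X = np.vstack([group_a, group_b])

scores, loadings, explained = pca_scores(X, n_components=3)
r2_cumulative = explained.sum()               # variance explained by the first 3 PCs
```

Plotting the first two or three score columns against each other gives a score plot of the kind shown in the figures; in this toy example the two groups separate along PC1.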
The samples from Sichuan (red spots) account for about 40% of all samples (41 of 101) and are widely distributed in the PCA score plot. They are mixed with the samples from other regions, which makes them difficult to distinguish. Although pretreatment methods such as derivatives and smoothing were applied, a clear classification of all the samples could not be obtained. The results indicate that, based on the NIR spectra, the quality of the samples from Sichuan covers a wide range and is not clearly different from that of the others. To improve the specificity of differentiation among the other samples, the samples from Sichuan were excluded. In the NIR spectra treated with the 2nd derivative + Savitzky–Golay filter, a clear grouping trend was observed in the PCA score plot. As shown in Fig. 6, the PCA score plot (PC1 vs. PC2 vs. PC3) allows four groups (G1–G4) of RPR to be differentiated. The first three PCs account for 70.8% (R2) of the total variance (PC1, 40.4%; PC2, 19.4%; PC3, 11.0%), and Q2 was 0.484. The score plot clearly shows that the RPR samples obtained from Guangxi (G1), Heilongjiang (G2), Neimeng (G3), and Jilin (G4) provinces are well differentiated from each other. In general, the results show that NIR spectroscopy combined with PCA partly describes the quality differences in RPR samples caused by cultivation and environmental factors.

4. Conclusion

In this study, a rapid NIR method was developed for the determination of gallic acid, catechin, albiflorin, and paeoniflorin in RPR. The results of different regression models were compared. Compared to PCR and SMLR, PLS has a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. Using PLS, different pretreatment methods were also compared, and optimized models for gallic acid, catechin, albiflorin, and paeoniflorin were established. This study demonstrated that NIR spectroscopy with PLS could be used for the rapid and efficient quality control of gallic acid, catechin, albiflorin, and paeoniflorin in RPR.


Fig. 4. Predicted vs. reference plots of the optimal PLS calibration models for gallic acid (A), catechin (B), albiflorin (C), and paeoniflorin (D). Both the x (actual) and y (calculated) axes are in %. Circles: calibration; crosses: validation.
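The figures of merit used to judge predicted vs. reference plots of this kind (the correlation coefficient r and the root mean square errors) can be computed from paired reference and predicted values. The sketch below is only illustrative: the function names and the numbers are invented, and it assumes the plain definition of RMSE with n in the denominator, whereas some software applies a degrees-of-freedom correction for RMSEC. The same `rmse` function yields RMSEC, RMSECV, or RMSEP depending on whether the predictions come from the calibration set, from cross-validation, or from an independent validation set.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    sxy = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))
    sxx = sum((r - mean_ref) ** 2 for r in reference)
    syy = sum((p - mean_pred) ** 2 for p in predicted)
    return sxy / math.sqrt(sxx * syy)

# Invented example: HPLC reference contents (%) and NIR predictions (%).
hplc_reference = [1.0, 2.0, 3.0, 4.0]
nir_predicted = [1.1, 1.9, 3.2, 3.8]

error = rmse(hplc_reference, nir_predicted)
r_value = pearson_r(hplc_reference, nir_predicted)
print(f"RMSE = {error:.3f} %, r = {r_value:.4f}")
```

A lower RMSE together with an r closer to 1 corresponds to points lying closer to the diagonal of the predicted vs. reference plot.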

Moreover, a classification study was carried out to investigate the possibility of discriminating the geographical origins of RPR samples by PCA. The results show that, except for the Sichuan samples, NIR spectroscopy coupled with PCA can clearly distinguish the samples originating from different areas. Overall, NIR spectroscopy with multivariate statistical analysis can be used as a quantitative, qualitative, and nondestructive analysis method for the quality evaluation of RPR.

Author contributions

Participated in research design: Meihong Fu, Bin Yang. Conducted experiments: Hao Zhan, Jing Fang, Liying Tang, Hua Li. Performed data analysis: Hongwei Wu, Hao Zhan. Wrote or contributed to the writing of the manuscript: Hongwei Wu, Hongjun Yang, Hao Zhan, Zhuju Wang.


Fig. 5. The 3D score plot of PCA for all the samples. Red spots: samples from Sichuan Province; blue spots: samples from other provinces. The samples from Sichuan (red spots) are widely distributed in the PCA score plot. They are mixed with the samples from other regions, which makes the groups difficult to distinguish. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. The 3D score plot of PCA for all the samples except those from Sichuan Province. G1: Guangxi (red spots), G2: Heilongjiang (blue spots), G3: Neimeng (green spots), G4: Jilin (purple spots). The score plot clearly shows that the RPR samples obtained from Guangxi (G1), Heilongjiang (G2), Neimeng (G3), and Jilin (G4) provinces are well differentiated from each other. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Acknowledgements

This study was financially supported by the Special Fund of the National Bureau of TCM (Grant No. 201407003), the National Science and Technology Major Project of Drug Discovery (No. 2011ZX09201-201-26) by the Ministry of Science and Technology of the People's Republic of China, and the National Standardization Project of TCM (No. ZYBZH-C-QIN-45).
