Estimation of Catechins Concentration of Green Tea Using Hyperspectral Remote Sensing Yusuke Sohara *. Chanseok Ryu **. Masahiko Suguri ***. Si-bum Park ****. Shigenobu Kishino *****.
*Graduate School of Agriculture, Kyoto University, Kyoto, Japan (Tel:+81-75-753-6167; e-mail:
[email protected]) ** Graduate School of Agriculture, Kyoto University, Kyoto, Japan (Tel:+81-75-753-6317; e-mail:
[email protected]) *** Graduate School of Agriculture, Kyoto University, Kyoto, Japan (Tel:+81-75-753-6279; e-mail:
[email protected]) **** Graduate School of Agriculture, Kyoto University, Kyoto, Japan, (Tel:+81-75-753-6114; e-mail:
[email protected]) ***** Graduate School of Agriculture, Kyoto University, Kyoto, Japan, (Tel:+81-75-753-6114; e-mail:
[email protected]) Abstract: Models for estimating the catechin concentrations of new green tea shoots were established using ground-based hyperspectral remote sensing. The coefficient of determination (R2) was at least 0.913, the root mean squared error of prediction (RMSEP) was at most 0.617% and the relative error of prediction (REP) was at most 8.98%, except in the EGC model (R2=0.512, RMSEP=0.272%, and REP=15.7%). The regression coefficients varied markedly in the green, red-edge and near-infrared regions, indicating that those regions were important for the estimation of catechin concentration. A similar trend was noted for the regression coefficients of ECg and EGCg. Therefore, the X-loadings of the first latent variables of ECg and EGCg (ester-type catechins) and of EC and EGC (free-type catechins) were compared, and similarities within each type of catechin were found. PLS regression models were therefore built for each type from the data of the ester- and free-type catechins. The accuracy of the free-type model was as follows: R2=0.774, RMSEP=0.273% and REP=7.85%. The accuracy of the ester-type model was as follows: R2=0.869, RMSEP=0.991% and REP=6.99%. The regression coefficients of the free-type catechins differed from those of the ester-type catechins. Large changes in the regression coefficients were also found in the green to red and red-edge regions.

Keywords: catechins concentration, hyperspectral image, reflectance of new shoots, PLS regression, regression coefficient
1. INTRODUCTION

Tea is one of the world's most popular beverages and has been cultivated for thousands of years. Japan's production volume (PV) of tea, mainly green tea, is 0.1 million tons/year, the 8th largest in the world (FAO, 2008). The main brands of green tea are Shizuoka (41% of PV), Kagoshima (18% of PV), and Uji (3% of PV) (MAFF, 2009). New shoots are generally harvested in the middle of May in Uji, where the best quality tea is produced, although PV is lower in Uji than in the other regions. Recently, tea's beneficial medical properties, such as its possible ability to prevent cancer, have received considerable attention (Yang and Wang 1993). As tea grows, the yield increases but the quality deteriorates. Farmers therefore need to harvest new shoots at a time when yield and quality are well balanced. However, the timing of harvest currently depends on the farmers' experience and is therefore ambiguous. A consistent method is needed to help determine the optimum plucking time so that yield and quality are well balanced.
Generally, the quality of green tea is negatively correlated with its quantity (Saba et al. 1993). Nitrogen concentration (Goto et al. 1986), amino acids (Horie et al. 1988), crude fiber (Smiechowska et al. 2006) and catechin concentrations (Ikegaya et al. 1988) have been used as chemical constituents for maintaining quality standards. Taste is considered one of the standards of tea quality and consists of sweetness, umami, bitterness, and astringency (Ruan et al. 2007). Bitterness is mainly influenced by the concentration of catechins, which have also received attention for their biological and pharmacological activities (Yang et al. 2002). Generally, there are four types of catechins in tea: epicatechin (EC), epigallocatechin (EGC), epicatechin gallate (ECg) and epigallocatechin gallate (EGCg). Although the typical methods for determining the concentration of catechins, such as Kjeldahl or HPLC, are precise, they are also time consuming and laborious. Therefore, several studies have been carried out on non-destructive methods such as near-infrared (NIR) spectroscopy (Chen et al. 2006) and H-NMR spectroscopy (Gall et al. 2004). However, these studies were based not on growing shoots but on green tea powder. The nitrogen concentration of new shoots in the field has been estimated from hyperspectral images (Ryu et al. 2007). One of the advantages of hyperspectral imaging is that it can measure the reflectance of a target without interference from other parts of the plant. Therefore, the objectives of this study were to establish catechin concentration estimation models based on the reflectance of new shoots in the field, to determine the accuracy of these models, and to identify the characteristics of catechins based on the reflectance of new shoots.
2. MATERIALS AND METHODS

2.1 Experimental field

The experimental field was located at Wazuka-Cho, Souraku-Gun, Kyoto prefecture, Japan (135°55'E, 34°46'N, 358 m above sea level). The tea cultivar planted in this field was Camellia sinensis L. 'Yabukita', which is cultivated for "Matcha". The Matcha field was covered with a shade screen to slow the growth of the plants and deepen the color of their leaves, as shown in Fig. 1. Chemical and organic fertilizers were applied in this field. In 2009, the test crop was covered with a screen net on 31 April and harvested on 20 May. Five sampling points were used in this field and measurements were carried out every three or four days.

Fig. 1. Experimental field of tea

2.2 Hyperspectral remote sensing

The images were taken at five sampling points using a QEV10E hyperspectral camera (Specim), as shown in Fig. 2. This camera measured the reflectance of tea leaves inside a circular frame (d = 300 mm). The camera was a line-scan camera positioned about 1 m above the circular frame, and the resolution at that position was approximately 2 mm × mm. Reflectance of the new shoots was measured in 1024 discrete narrow bands between 400 and 1000 nm. A reflectance board was positioned on either side of the circular frame in order to reduce the influence of variation in the intensity of incoming light.

2.3 Sampling

New shoots that had more than "pokoe" (a new shoot with two leaves and a bud) in the circular frame were sampled. The samples were parched in a 500 W microwave oven and dried for 24 hours at 60°C in a circulation oven drier. The dry mass of each sample was measured using an electronic scale. Samples were then finely ground using a pulverizing mill (Ryu et al. 2007).
Fig. 2. Arrangement for taking pictures

2.4 High performance liquid chromatography (HPLC)

Approximately 50 mg of the powdered material was extracted with 5 ml of 50% aqueous acetonitrile containing 0.085% phosphoric acid under ultrasonic treatment for 30 min. The extracts were centrifuged at 1,500 g for 10 min. The resultant solution was diluted twofold with water and filtered through a 0.45 µm filter. The filtered samples were separated by reverse-phase high-performance liquid chromatography using a Shimadzu LC-VP system (Shimadzu) equipped with a Develosil ODS-HG-5 column (4.6 mm × 150 mm, Nomura Chemical) placed in a column oven set at 40 °C and a UV detector (SPD-20AV, Shimadzu) set at 230 nm. The major catechins were separated using 0.085% phosphoric acid (eluent A) and 40% acetonitrile in 0.085% phosphoric acid (eluent B). The gradient was: 0-10 min, 20% B; 10-30 min, linear gradient 20-49% B; 30-40 min, 75% B; 40-55 min, 20% B. The flow rate was 1.0 ml/min. Individual peaks were identified by comparing their retention times with those of authentic standards. All experiments were carried out in triplicate. The catechin concentration was calculated as the percentage of catechins per unit dry weight.

2.5 Image processing

The Environment for Visualizing Images software (ENVI version 4.5, Research Systems) was used to separate each image into two parts: 1) new shoots and 2) others (mainly old leaves), using the index shown in equation (1). The nitrogen concentration of young leaves differs from that of old leaves, and this difference can be estimated using NDVI and Green NDVI. Therefore, the difference between Green NDVI and NDVI was utilized to separate new shoots from others. The reflectance of new shoots was divided by the reflectance of the reflectance board and defined as "Reflectance", as shown in equation (2).

$$\mathrm{GreenNDVI} - \mathrm{NDVI} = \frac{R_{NIR} - R_{Green}}{R_{NIR} + R_{Green}} - \frac{R_{NIR} - R_{Red}}{R_{NIR} + R_{Red}} \qquad (1)$$
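As a minimal sketch of the image-processing step, the difference index of equation (1) and the board normalization of equation (2) could be computed per pixel as follows. The function names, the default threshold and the rule that values below the threshold are treated as new shoots are assumptions made for illustration; the study reports only that the threshold ranged from -0.33 to -0.13 and that the separation was done in ENVI.

```python
def green_ndvi_minus_ndvi(r_nir, r_green, r_red):
    """Difference index of equation (1) for one pixel."""
    green_ndvi = (r_nir - r_green) / (r_nir + r_green)
    ndvi = (r_nir - r_red) / (r_nir + r_red)
    return green_ndvi - ndvi


def is_new_shoot(r_nir, r_green, r_red, threshold=-0.2):
    """Flag a pixel as a new shoot.

    ASSUMPTION: pixels whose index falls below the threshold are new shoots.
    The default of -0.2 is a hypothetical value inside the reported range
    (-0.33 to -0.13); the direction of the comparison is not stated in the text.
    """
    return green_ndvi_minus_ndvi(r_nir, r_green, r_red) < threshold


def normalized_reflectance(shoot_signal, board_signal):
    """Equation (2): band-by-band ratio of shoot signal to board signal."""
    return [s / b for s, b in zip(shoot_signal, board_signal)]
```

In practice the threshold would have to be chosen per image, since the text notes that it varied with the intensity of the incoming light.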
$$\mathrm{Reflectance} = \frac{\text{Reflectance of new shoots}}{\text{Reflectance of reflectance board}} \qquad (2)$$

where R_NIR, R_Green and R_Red represent the reflectance in the band indicated by the subscript.

Fig. 3. Green NDVI-NDVI image

Figure 3 shows the Green NDVI-NDVI image. The new shoots appear in dark black. Each image was separated into two parts by thresholding Green NDVI-NDVI. The threshold value varied with the intensity of the incoming light and ranged from -0.33 to -0.13.

2.6 PLS regression models

PLS regression models were established from the catechin concentrations and reflectances using the statistics software The Unscrambler version 9.6 (CAMO). The validity of each model was analyzed using full cross validation. The accuracy of each model was evaluated using the coefficient of determination (R2), the root mean squared error of prediction (RMSEP) of the full cross validation, and the relative error of prediction (REP), as shown in equations (3) and (4). RMSEP indicates the absolute magnitude of the error and REP indicates the error relative to the measured value (Feng et al. 2008).

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{x}_i - x_i)^2}{n}} \qquad (3)$$

$$\mathrm{REP} = \frac{100}{n} \sum_{i=1}^{n} \frac{|\hat{x}_i - x_i|}{x_i} \qquad (4)$$

where n is the number of samples, and x_i and \hat{x}_i are the measured and predicted values of the catechin concentration of tea.

3. RESULTS AND DISCUSSION

3.1 Temporal variation of catechins concentration

Table 1 shows the temporal variation of each catechin concentration measured in 2009. These values are the concentrations of each catechin on a dry mass basis. Each catechin concentration decreased as the tea grew. The concentration of EGCg was higher than that of any other catechin. Although the concentration of EGC increased during the initial growth stages, it decreased after 4 May. The concentration of free-type catechins, which consist of EC and EGC, was less than 3.5% of the dry mass. The concentration of ester-type catechins, which consist of ECg and EGCg, was less than 16% of the dry mass. The rate of decrease of each catechin slowed after 11 May. This may have been influenced by the shade screen that was used to cover the tea plants from 31 April.

Table 1. Temporal variation in catechins concentrations

Date    EC             EGC            ECg            EGCg
4/24    1.65 ± 0.07    1.43 ± 0.22    4.89 ± 0.55    11.4 ± 0.89
4/28    1.74 ± 0.07    1.74 ± 0.27    4.28 ± 0.37    10.4 ± 0.72
5/1     1.66 ± 0.14    2.04 ± 0.32    3.75 ± 0.36    9.66 ± 0.72
5/4     1.51 ± 0.05    1.74 ± 0.27    3.21 ± 0.31    8.44 ± 0.48
5/8     1.36 ± 0.09    1.45 ± 0.23    2.83 ± 0.39    7.60 ± 0.74
5/11    1.26 ± 0.08    1.16 ± 0.20    2.39 ± 0.15    6.68 ± 0.20
5/14    1.09 ± 0.04    1.22 ± 0.19    2.38 ± 0.34    7.02 ± 0.73
5/18    1.08 ± 0.03    1.25 ± 0.20    2.29 ± 0.35    6.96 ± 0.90
5/20    1.00 ± 0.07    1.25 ± 0.22    2.20 ± 0.16    6.86 ± 0.32

(All values are mean ± standard deviation [%], bold letters are after covering the screen net)

3.2 Reflectance

Figure 4 shows the reflectance of the new shoots. The reflectance varied from the green region onward, whereas the reflectance in the blue region varied little. The reflectance in the NIR and green regions was notably higher than in the other regions.

Fig. 4. Reflectance of new shoots

3.3 Results of PLS regression

3.3.1 Results of PLS regression analyses of each catechin

Table 2 shows the results of the PLS regression analyses of each catechin. R2 was at least 0.913, RMSEP was at most 0.617%, and REP was at most 8.98%, except in the EGC model (R2=0.512, RMSEP=0.272%, and REP=15.7%). When the accuracies of the models were evaluated using REP, the EC model was the most precise and the ECg and EGCg models were below 10%. The REP value of EGC was higher than those of the other catechins. These results may have been influenced by temporal variation.
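The two accuracy measures of equations (3) and (4) can be illustrated with a short sketch. It assumes that REP is the mean absolute relative error expressed in percent; the function names and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def rmsep(measured, predicted):
    """Root mean squared error of prediction, equation (3)."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)


def rep(measured, predicted):
    """Relative error of prediction in percent, equation (4).

    ASSUMPTION: the error of each sample enters as an absolute value
    divided by the measured value.
    """
    n = len(measured)
    return 100.0 / n * sum(abs(p - m) / m for m, p in zip(measured, predicted))


# Hypothetical concentrations [% of dry mass], for illustration only.
measured = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]
print(rmsep(measured, predicted), rep(measured, predicted))
```

RMSEP carries the unit of the concentration, so it grows with the size of the values being predicted, whereas REP is scale-free; this is why the two measures can rank models differently.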
Table 2. Results of PLS regression modelling of each catechin

         PCs    R2       RMSEP [%]    REP [%]
EC       8      0.966    0.100        2.93
EGC      2      0.512    0.272        15.7
ECg      4      0.926    0.330        8.98
EGCg     4      0.913    0.617        6.17
(PCs: number of latent variables, RMSEP: root mean squared error of prediction, REP: relative error of prediction)

EGC concentration increased during the initial growth stages and then decreased, as shown in Table 1. Overall, it was possible to estimate the concentration of each catechin in new shoots during the growth stage using hyperspectral remote sensing.

3.3.2 Measured and predicted catechin concentrations

Figure 5 shows the measured and predicted concentrations of EC. The plot can be separated into two groups, before and after the shading screen was placed. The EC concentration before and just after the shading screen net was placed (from 24 April to 1 May) was higher than 1.5%, whereas it was lower than 1.5% after shading. Thus, the results of this study may have been influenced by the timing and placement of the shading screen. This suggests that it may be possible to control the concentration of catechins by adjusting the time the tea plants spend in the shade. Figure 6 shows the measured and predicted concentrations of EGC as an example of a poor fit. The points were not evenly distributed from the initial growth stage to harvest. This finding may have been influenced by temporal variations in EGC concentration, as shown in Table 1. Even at the same EGC concentration, the reflectance of new shoots may differ before and after the concentration peak, which may explain this result.
Fig. 6. Measured and predicted concentrations of EGC

3.3.3 Regression coefficients of each catechin concentration

Figure 7 shows the regression coefficients of each catechin concentration. The scale of the regression coefficients of EGC was changed so that the important variables could be compared with the overall pattern of the regression coefficients. The regression coefficients varied markedly in the green, red-edge, and NIR regions. This indicates that those regions may be important for estimating the concentration of each catechin. The regression coefficients of EC were more variable, possibly because they were influenced by the number of latent variables (PCs). There were similarities in the regression coefficients of the ester-type catechins (ECg and EGCg) but not in those of the free-type catechins (EC and EGC). Therefore, the first latent variables (PC1) were compared with each other. Figure 8 shows the X-loadings of PC1 of each catechin. There was a similar trend in the first latent variable (PC1) between ECg and EGCg (ester-type catechins) and between EC and EGC (free-type catechins). Differences in X-loadings between the ester- and free-type catechins appeared in the visible, red-edge, and NIR (800–1000 nm) regions. This indicates that it may be possible to separate these two types of catechins using hyperspectral reflectance.
Fig. 5. Measured and predicted concentration of EC

Fig. 7. Regression coefficients of each catechin
Fig. 8. Regression coefficients of each catechin (PC1)
Fig. 9. Regression coefficients of each type of catechins
3.3.4 Results of PLS regression modeling of ester- and free-type catechins

PLS regression models were recalculated for the ester- and free-type catechins (Table 3). The accuracy of the free-type model was R2 = 0.774, RMSEP = 0.273% and REP = 7.85%. The accuracy of the ester-type model was R2 = 0.869, RMSEP = 0.991% and REP = 6.99%. The R2 value of the ester-type model was higher than that of the free-type model, and its REP value was lower. Compared with the ECg and EGCg models (Table 2), the R2 value of the ester-type model decreased and the RMSEP value increased; however, its REP value was similar to those of ECg and EGCg despite a decrease in the number of latent variables. In the case of the free-type model, the precision may have been influenced by the differences between the EC and EGC models.

3.3.5 Regression coefficients of each type of catechins concentration

Figure 9 shows the regression coefficients of each type of catechin. The regression coefficients of the free- and ester-type catechins show completely different trends. Because the number of latent variables of the free-type model was greater than that of the ester-type model, its regression coefficients were more variable. Large changes in the regression coefficients were observed in the green, red, and red-edge regions, as previously mentioned. The trends of the free-type catechins were similar to those of EC, as shown in Fig. 7. This indicates that the regression coefficients of the free-type catechins were influenced mainly by the concentration of EC.
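To make the grouping into free- and ester-type catechins concrete, the sketch below adds the Table 1 means of EC and EGC (free type) and of ECg and EGCg (ester type) for each sampling date. It assumes that the response of each grouped model is the sum of its two constituents, which the text implies but does not state outright; the names `TABLE1_MEANS` and `group_totals` are hypothetical, and standard deviations are ignored.

```python
# Mean concentrations [% of dry mass] copied from Table 1 (standard deviations omitted).
TABLE1_MEANS = {
    "4/24": {"EC": 1.65, "EGC": 1.43, "ECg": 4.89, "EGCg": 11.4},
    "4/28": {"EC": 1.74, "EGC": 1.74, "ECg": 4.28, "EGCg": 10.4},
    "5/1": {"EC": 1.66, "EGC": 2.04, "ECg": 3.75, "EGCg": 9.66},
    "5/4": {"EC": 1.51, "EGC": 1.74, "ECg": 3.21, "EGCg": 8.44},
    "5/8": {"EC": 1.36, "EGC": 1.45, "ECg": 2.83, "EGCg": 7.60},
    "5/11": {"EC": 1.26, "EGC": 1.16, "ECg": 2.39, "EGCg": 6.68},
    "5/14": {"EC": 1.09, "EGC": 1.22, "ECg": 2.38, "EGCg": 7.02},
    "5/18": {"EC": 1.08, "EGC": 1.25, "ECg": 2.29, "EGCg": 6.96},
    "5/20": {"EC": 1.00, "EGC": 1.25, "ECg": 2.20, "EGCg": 6.86},
}


def group_totals(row):
    """Return (free-type, ester-type) totals for one sampling date.

    ASSUMPTION: each group is the plain sum of its two catechins.
    """
    free = row["EC"] + row["EGC"]
    ester = row["ECg"] + row["EGCg"]
    return free, ester


totals = {date: group_totals(row) for date, row in TABLE1_MEANS.items()}
for date, (free, ester) in totals.items():
    print(f"{date}: free-type {free:.2f} %, ester-type {ester:.2f} %")
```

Because these are sums of date means rather than of individual samples, they only indicate the approximate range of the responses used in the grouped models.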
3.4 Measured and predicted concentrations of each type of catechin

Figure 10 shows the measured and predicted concentrations of the free-type catechins. The free-type catechins were separated into two groups depending on shading. The concentrations before shading (from 24 April to 1 May) were higher than around 3.0%, and those after shading (from 4 May to 20 May) were lower than around 3.0%. Figure 11 shows the measured and predicted concentrations of the ester-type catechins. The ester-type catechins were also separated into two groups depending on shading. The concentrations before shading (from 24 April to 1 May) were higher than around 11%, and those after shading (from 4 May to 20 May) were lower than around 11%. This indicates that it may be possible to control the concentration of the ester-type catechins by controlling the amount of shade the plants receive.
Table 3. PLS regression modeling for each type of catechin

              PCs    R2       RMSEP [%]    REP [%]
free-type     7      0.774    0.273        7.85
ester-type    3      0.869    0.991        6.99

(PCs: number of latent variables, RMSEP: root mean squared error of prediction, REP: relative error of prediction)
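The models in Tables 2 and 3 were built in The Unscrambler, whose internals are not described here. As a rough illustration of what a single-response PLS model with a chosen number of latent variables and full cross validation involves, the sketch below implements the textbook NIPALS PLS1 algorithm in plain Python and evaluates it by leave-one-out, which is how full cross validation is commonly understood. All function names and the toy data are hypothetical; this is not the software's implementation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, [(w, p, c), ...])."""
    n, k = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    Xk = [[row[j] - x_mean[j] for j in range(k)] for row in X]
    yk = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the y residual.
        w = [sum(Xk[i][j] * yk[i] for i in range(n)) for j in range(k)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xk]  # scores
        tt = dot(t, t)
        p = [sum(Xk[i][j] * t[i] for i in range(n)) / tt for j in range(k)]  # X-loadings
        c = dot(yk, t) / tt  # y-loading
        # Deflate X and y before extracting the next latent variable.
        Xk = [[Xk[i][j] - t[i] * p[j] for j in range(k)] for i in range(n)]
        yk = [yk[i] - c * t[i] for i in range(n)]
        components.append((w, p, c))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps on it."""
    x_mean, y_mean, components = model
    xk = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, c in components:
        t = dot(xk, w)
        y_hat += c * t
        xk = [a - t * b for a, b in zip(xk, p)]
    return y_hat


def loo_predictions(X, y, n_components):
    """Leave-one-out cross validation: each sample is predicted by a model fitted without it."""
    preds = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(pls1_predict(pls1_fit(X_train, y_train, n_components), X[i]))
    return preds


# Toy data (hypothetical): 8 samples, 3 "bands", response exactly linear in the bands.
X_demo = [[float(i), float(i * i), float(i % 3)] for i in range(1, 9)]
y_demo = [2.0 + 0.5 * a - 0.1 * b + 1.5 * c for a, b, c in X_demo]
print(loo_predictions(X_demo, y_demo, 2))
```

The cross-validated predictions returned by `loo_predictions` are the values from which R2, RMSEP and REP would then be computed; the number of latent variables (PCs) is the main tuning choice, and more PCs give a more flexible but more variable set of regression coefficients.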
Fig. 10. Measured and predicted concentrations of free-type catechins
Fig. 11. Measured and predicted concentrations of ester-type catechins

In this study, it was possible to estimate the concentration of each type of catechin using ground-based hyperspectral remote sensing. However, it may be necessary to validate each model against data from other years in order to assess its precision. Moreover, it may be necessary to accumulate enough data to build a model that accounts for year-to-year variation, because the concentration of catechins varied depending on several conditions, including temperature, how much shade the plants received, and the nitrogen content of the new shoots.

4. CONCLUSIONS

Models for determining the concentration of catechins in new shoots were established using ground-based hyperspectral remote sensing. R2 values were at least 0.913, RMSEP values were at most 0.617%, and REP values were at most 8.98%, except in the EGC model (R2 = 0.512, RMSEP = 0.272%, REP = 15.7%). The regression coefficients varied markedly in the green, red-edge, and NIR regions. This indicates that those regions are important for the estimation of catechin concentration. The regression coefficients of ECg and EGCg shared similar trends. Therefore, PLS regression models were recalculated for the ester- and free-type catechins. The accuracy of the free-type model was R2 = 0.774, RMSEP = 0.273% and REP = 7.85%. The accuracy of the ester-type model was R2 = 0.869, RMSEP = 0.991% and REP = 6.99%. The regression coefficients of the free-type catechins differed from those of the ester-type catechins. Large changes in regression coefficients were noted in the green, red, and red-edge regions.

REFERENCES

Chen, Q., Zhao, J., Huang, X., Zhang, H., and Liu, M. (2006). Simultaneous determination of total polyphenols and caffeine contents of green tea by near-infrared reflectance spectroscopy. Microchemical Journal, 83, 42-47.
Feng, W., Yao, X., Zhu, Y., Tian, Y.C., and Cao, W.X. (2008). Monitoring leaf nitrogen status with hyperspectral reflectance in wheat. European Journal of Agronomy, 28, 394-404.

Gall, G. L., Colquhoun, I. J., and Defernez, M. (2004). Metabolite Profiling Using 1H NMR Spectroscopy for Quality Assessment of Green Tea, Camellia sinensis. J. Agric. Food Chem, 52, 692-700.

Goto, T., Uozumi and Suzuki, T. (1986). Near infrared spectroscopic analysis of total nitrogen content in tea "Sen-cha". Bull. Shizuoka Tea Exp (in Japanese with English summary), 12, 61-81.

Horie, H., Fukatsu, S., Mukai, T., Goto, T., Kawanaka and Shimohara, T. (1988). Quality evaluation on green tea. Sensors and Actuators B(Kanetani)13-14, 451-454.

Ikegaya, K., Takayanagi, H., Anan, T., Iwamoto, T., Uozumi, J., Nishinari, K., and Cho, R. (1988). Determination of the content of total nitrogen, caffeine, total free amino acids, theanine and tannin of sencha and maccha by near infrared reflectance spectroscopy. Bulletin of the National Research Institute of Vegetables, Ornamental Plants and Tea. Series B, 47-90.

Ruan, J., Gerendás, J., Haerdter, R., and Sattelmacher, B. (2007). Effect of alternative anions (Cl- vs. SO42-) on concentrations of free amino acids in young tea plants. Journal of Plant Nutrition and Soil Science, 170, 49-58.

Ryu, C.S., Suguri, M., Asai, A., Kurimoto, T., and Umeda, M. (2007). Estimation of quality of green-tea by hyperspectral image. 2nd Advanced Nondestructive Evaluation, 988-993.

Saba, T., Aono, H., and Tanaka, S. (1993). Methods for the Estimation of Optimum Plucking Time in Relation to Tea Quality Based on the Characteristics of New Shoots of Tea Plants. Bulletin of the National Research Institute of Vegetables, Ornamental Plants and Tea. Series B, 6, 11-20.

Smiechowska, M., and Dmowski, P. (2006). Crude fibre as a parameter in the quality evaluation of tea. Food Chemistry, 94, 366-368.

Yang, C.S., Maliakal, P., and Meng, X. (2002). Inhibition of carcinogenesis by tea. Annual Review of Pharmacology and Toxicology, 42, 25-54.

Yang, C.S., and Wang, Z. Y. (1993). Tea and Cancer. Journal of the National Cancer Institute, 85, 1038-1049.