Quality level identification of West Lake Longjing green tea using electronic nose


Sensors & Actuators: B. Chemical 301 (2019) 127056

Contents lists available at ScienceDirect

Sensors and Actuators B: Chemical journal homepage: www.elsevier.com/locate/snb





Xiaohui Lu (a), Jin Wang (a,*), Guodong Lu (a), Bo Lin (b), Meizhuo Chang (a), Wei He (a)

(a) State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China
(b) College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

ARTICLE INFO

Keywords: Tea aroma; Electronic nose; Quality level identification; Machine learning

ABSTRACT

China grows and consumes numerous types of tea, which have diverse processing techniques. West Lake Longjing Tea is one of the most famous and popular varieties of tea in China. It is difficult for consumers to assess the quality of Longjing green tea, as it usually requires well-trained experts to make the judgement based on colour, aroma, and taste. To this end, we propose a quality identification system consisting of a self-developed electronic nose and a data analysis algorithm to assess the quality of West Lake Longjing Tea based on its aroma. The equipment was tested extensively in experiments conducted on real-world data. The results show that the proposed system is capable of distinguishing the tea grades accurately. Furthermore, we studied the quality specifications of Longjing tea sold by different brands and found that standard certified brands have more accurate quality identification criteria than non-standard certified brands. Our findings will assist customers and tea factories in evaluating the quality of Longjing Tea and guide the optimisation of quality standards.

1. Introduction

China is the world’s second largest exporter of tea. Fifty percent of the tea grown in China is green tea. The primary market of West Lake Longjing Tea is China. It is one of the top ten popular tea varieties in China and has a history of more than 1200 years. Its brand value in China exceeds ¥5 billion [1]. The annual sales volume of West Lake Longjing Green Tea in China surpasses 3000 t. It is mainly grown in the West Lake Cultural Landscape of Hangzhou, which was listed by UNESCO as a World Heritage Site in 2011. However, there are countless West Lake Longjing tea varieties on the market, which are grown and sold by different producers and brands. Longjing tea grades vary widely from brand to brand, and the price ranges from hundreds of yuan per kilo to tens of thousands of yuan per kilo. Each company has its own pricing standard. Moreover, the quality and authenticity of the tea products may not be guaranteed. Therefore, a method for the quality identification of West Lake Longjing green tea has important theoretical and practical value. Different Longjing tea varieties have different flavours for several reasons, such as the quality of the tea leaves, production process, storage time, and storage environment. Each brand has its own method for grading the quality of Longjing tea, which ordinary consumers may find difficult to understand and apply.

Quality identification of tea requires general sensory evaluation. Tea experts rate the quality of tea samples based on various aspects. High-quality tea typically has a mellow taste, fragrant aroma, and elegant appearance [2]. Nowadays, researchers use many advanced technologies to analyse the effect of the chemical composition of tea [3]. These technologies are challenging for ordinary customers or small tea companies to use. Therefore, we aim to develop auxiliary methods to identify the differences between various varieties of West Lake Longjing green tea.

Tea aroma is one of the most important factors that determine tea quality. The most common and accurate method for distinguishing the quality level of green tea is to employ tea experts to observe, smell, and taste different green tea samples. However, this method consumes time, money, labour, and material resources. To gather data on tea aroma, we built an electronic nose (e-nose) system to simulate a tea expert’s nose. An e-nose is more effective in the synthetic evaluation of tea aroma than the human nose. The human nose can only determine whether the green tea aroma is similar to that of fresh herbs. Therefore, we decided to use the e-nose as a simpler alternative for assessing tea quality.

Longjing tea has more than 500 aroma components, which can be separated into more than 20 categories. Hundreds of volatile compounds that form the Longjing green tea aroma have been analysed by gas chromatography mass spectrometry (GC–MS). The major aroma components are 2-acetylpyrrole, geraniol, phenylethyl alcohol, benzyl alcohol, linalool, and 4-vinylphenol. Longjing green tea has the aroma of fresh herbs, which is ascribed to 2-acetylpyrrole, geraniol, and phenylethyl alcohol. These three components have been reported to have a ‘herbal or floral rose-like’ odour [4]. Because no sensor has been manufactured specifically for these components, we focus on identifying sensors for their general categories, the most important of which are aromatic hydrocarbons, alkanes, and alkenes [5]. The sensors used in this research were chosen based on the target aroma of Longjing tea.

The main purpose of this research is to build a Longjing tea quality identification system that can be easily used by tea companies, tea factories, customers, and tea research institutes. In this study, we first built an e-nose system containing well-chosen gas sensors to test the tea aroma components. Secondly, we experimentally tested the e-nose on different Longjing tea samples. Thirdly, we analysed the tea aroma data obtained from the experiments. Finally, based on the data analysis results, we proposed a method to distinguish the quality of Longjing tea easily.

The main research objectives of this paper are as follows:

• A self-developed e-nose system is presented to identify Longjing tea of different brands and quality levels.
• The identification results of the blind samples prove the efficacy of our system.
• The proposed method based on random forest is compared with other machine learning methods to demonstrate the superior performance of our method.
• Analysis results show that all sensors contribute to the classification results.
• Our study provides an effective reference method for Longjing tea market standardisation.

The remainder of this paper is structured as follows: Section 2 discusses studies related to e-noses, tea grading methods, and other advanced technologies relevant to our research. Section 3 describes the materials and methods used in our experiments on Longjing tea samples, the structure of the self-made e-nose system, details of the experimental design, and the aroma detection methods. Section 4 proves the practicability of our system and designs a series of cross-contrast tests to provide a reference for the establishment of industry standards for Longjing tea grades. Section 5 presents the concluding remarks.

* Corresponding author.
E-mail addresses: [email protected] (X. Lu), [email protected] (J. Wang), [email protected] (G. Lu), [email protected] (B. Lin), [email protected] (M. Chang), [email protected] (W. He).
https://doi.org/10.1016/j.snb.2019.127056
Received 10 April 2019; Received in revised form 7 August 2019; Accepted 26 August 2019; Available online 29 August 2019
0925-4005/© 2019 Published by Elsevier B.V.

2. Related work

The chemical components of tea aroma are the most important factors to consider in the quality assessment of tea. The most common and rapid method to determine the volatile aroma components is GC–MS [6–8]. However, gas chromatography is an expensive method. Moreover, analysing the relationship between volatile components and tea grade is very complicated for laypersons. Therefore, many methods such as electronic eyes, e-noses [9,10], and electronic tongues [11–14] have been proposed in the literature. An e-nose is the most practical way to test tea aroma.

The core component of an electronic nose is the gas sensor. Different gas sensors are sensitive to different types of gas [15]. Therefore, an e-nose is selected depending on the target gas to be detected [16]. Gas molecules react chemically or physically with the material inside the gas sensor, which generates a series of response signals from the gas sensor to the processor [17]. The reaction process varies among sensors. For example, conducting polymer composite sensors treat conducting polymers as the active layer [18]; metal oxide sensors focus on the conductance change during the reaction [19]; and semiconductor gas sensors rely on the decrease in the resistance value [20]. Metal-oxide semiconductor sensors have many advantages such as fast response, wide practical applicability, relatively low cost, and easy integration. Therefore, we selected metal-oxide semiconductor sensors to build our e-nose.

Different possibilities are available for the analysis of tea aroma data. Machine learning is the most common method for building classification models. Although there are several branches of machine learning methods, neural networks such as artificial neural networks (ANN) [21], self-organizing map (SOM) [15], incremental self-organizing map (i-SOM) [22], fuzzy ART [23], and fuzzy based response of signal with time (FRST) [24] are widely used for data analysis. The classification rates of ANN [21], fuzzy ART [23], and FRST [24] are 82.321%, 81.2%, and 90.11%, respectively, when used with an e-nose. Hence, an electronic tongue is added to obtain a better result, which may be costly. SOM [15] mainly focuses on tea quality during processing but not on the tea product. The classification rate of i-SOM [22] is lower than that of our results. Supervised learning methods such as support vector machine (SVM), One-vs-Rest (OVR) [25], and Bayesian classifier [26] are useful data analysis techniques. Methods such as Genetic Algorithm (GA) [27] and Wavelet Transform (WT) [26] were mainly designed for estimating the quality of black tea in India and England.

Yu et al. proposed three methods for identifying the quality grade of green tea. The first method used principal component analysis (PCA), cluster analysis (CA), back-propagation neural network (BPNN), and probabilistic neural network (PNN) [28]. The second method used PCA and linear discriminant analysis (LDA) [29]. The third method used PCA, LDA, and BPNN [30]. Their study required 30 samples for determining the quality level of each Longjing green tea, whereas our study requires only four samples. In addition, our results show greater accuracy than their results.

A new e-nose system was developed in this work. We chose a series of gas sensors especially for sensing Longjing tea aroma. Experiments were designed to analyse the quality of Longjing tea available on the market. The main objective of this research is to present an effective quality identification system that can be used by laypersons to easily differentiate between different Longjing tea grades and brands.

3. Materials and methods

3.1. Experimental objects

Table 1 lists 7 groups of West Lake Longjing green tea bought from different companies at different prices. ‘A’, ‘B’, and ‘C’ represent the quality level of the tea. The 7 groups can be divided into two categories, namely, S and N. The groups ‘SA’, ‘SB’, and ‘SC’ are the standard certified set ranked by professional tea experts. Thus, their classification grade is certified. The other 4 sets are non-standard certified and are represented by ‘N’; their levels are assigned by their manufacturing companies. The numbers in the labels denote two different tea companies. Thus, ‘N1A’ and ‘N1B’ are two Longjing tea samples of different quality levels from the same tea manufacturer. In reality, only professionally trained tea experts are qualified to assess tea quality. Tea experts assign a quality level to each tea sample depending on its colour, aroma, taste, and shape. We mainly focus on quantifying the relationship between tea quality level and tea aroma. In addition, we intend to evaluate the importance of different gas sensors in the identification of the quality level of tea.

Table 1
West Lake Longjing green tea samples used as experimental materials.

Label | Price (CNY/500 g)
SA | 500
SB | 260
SC | 100
N1A | 850
N1B | 336
N2A | 690
N2B | 345

3.2. Data acquisition equipment

Tea experts consider tea aroma as one of the most important factors of tea quality. Here, we investigate the scientific basis of this manual method and quantify the aroma sensed by experts. Tea aromas mainly contain chemical substances such as alcohols, aldehydes, and aromatic hydrocarbons. We used a self-developed e-nose to perform the data acquisition, that is, to gather the aroma data of tea samples.

The core of the e-nose system is a multi-sensor array. A gas sensor is sensitive to certain types of gas. Nevertheless, most sensors have been developed for application to multiple scenarios [31,32]. Our team has performed prior research on sensor selection [33]. The eight sensors listed in Table 2 were chosen based on the main components of tea aroma and the correlation analysis method. We tested different combinations of sensors, and the results showed that these eight sensors are the most suitable for detecting the aroma components of Longjing tea. The multi-sensor array in our e-nose comprises these eight gas sensors.

Apart from the multi-sensor array, the major components of the e-nose system shown in Fig. 1 are the pump, exhaust fan, electromagnetic directional valve, drying pipe, and data transmission and processing system. A low gas flow rate facilitates the reaction between the tea aroma and the multi-sensor array. Therefore, the gas flow rate of our pump was set to 2 L/min. We used a 60 mm × 60 mm × 20 mm exhaust fan to vent the experimental gases from the air cavity. A QDH3102E high-frequency electromagnetic directional valve was used to switch between two gas passages (acquisition gas passage and cleaning gas passage). In addition, a silica gel drying tube was used to absorb water vapour, as excessive water vapour affects the reaction process of the gas sensors and reduces their service life. Silica gel is very suitable for this purpose, as it has stable chemical properties; moreover, it is non-toxic, odourless, reusable, and affordable.

An STC12C56AD Single Chip Microcomputer (SCM) is the processing system used for data acquisition. The STC12C56AD SCM has many advantages such as small volume, stable performance, and low cost. In our e-nose system, the SCM plays an important role as the lower computer in data transmission. An analogue-to-digital converter (ADC) program was implemented in the SCM to convert the continuous analogue signals of the sensors into discrete digital signals and transmit those signals to a PC (the upper computer).

Table 2
Characteristics of eight sensors selected in our experiment.

Sensor | Chemicals detected by the gas sensor
TGS813 | methane, propane, butane
TGS822 | vapours of organic solvents, other volatile vapours, combustible gas (carbon monoxide)
TGS2600 | low concentrations of gaseous air contaminants (hydrogen, carbon monoxide)
TGS2602 | low concentrations of odorous gas (ammonia, hydrogen sulphide)
TGS2620 | volatile organic vapours
MQ-6 | combustible gas (e.g. propane)
MQ-135 | ammonia, oxynitride, aromatic compounds, and sulphide
MQ-138 | alcohol, aldehydes, ketones, and aromatic hydrocarbons

3.3. Experimental design

The main objective of this experiment is to simulate the process by which experts identify the quality level of tea. The aroma is released from tea mainly by brewing. Without brewing, some aroma components will be missing and the concentrations of aromatic compounds will not be sufficiently high for the sensors to distinguish between the samples. Only professionally trained tea experts are qualified to assess tea quality in China. Therefore, we have made our best attempt to simulate the tea tasting by experts according to the Chinese Standard GB/T 23776-2018 [34]. Most researchers believe that sufficient information for the detection of volatile compounds can be acquired in 60–70 s [10,11,27,29]. Accordingly, we attempted to find the contribution of each sensor to our classification results after 70 s. The details are discussed in Section 4.3.

The detailed procedure is as follows: Firstly, we take 20 g of tea from each group and divide it evenly into four portions (5 g each). Secondly, we use the four portions to prepare four cups of tea with 250 ml of boiling water. Thirdly, we filter the tea after 5 min to drain the water from the tea leaves. Fourthly, we seal the leaves and wait for 45 min (indoor temperature 25 ± 1 ℃ and indoor humidity 80 ± 2%) to collect the volatile components of the tea. Fifthly, we pump the tea aroma into the e-nose system. Sixthly, we read the response values of the 8 sensors every second for 80 s. The raw data for each cup are therefore 640-dimensional (80 s × 8 sensors).

Fig. 1. Photograph of the self-developed e-nose system. Part A contains the multi-sensor array covered by a black air chamber. The exhaust fan is directly below the multi-sensor array. Part B is the electromagnetic directional valve used to switch the air channels (cleaning/collecting channels). Part C contains two pumps. Part D is the data processing system. Part E is the drying pipe.

3.4. Data acquisition

The eight gas sensors are responsible for detecting different aroma components. The raw data collected from the experiment comprise the voltage values of each sensor, which lie between 0 V and 5 V. The higher the value, the more sensitive the sensor is to the aroma. Some sensors, such as MQ-6 and MQ-135 in Fig. 2, are sensitive to Longjing tea aroma. The voltage values change over time because of variation in the gas flow, but they always remain within a certain interval. Each type of Longjing tea yields 8 sensors × 80 s × 4 cups of data. Considering the seven types of samples, 17,920 (7 sample types × 8 sensors × 80 s × 4 cups) voltage values were used as the original tea aroma dataset.

Fig. 2. Response data curves of 8 gas sensors for SA.

We performed the experiment to observe the sensor response patterns of our e-nose system. Fig. 3 shows that the Longjing green tea samples produce a certain pattern. However, the detailed response values are different for each sample. Therefore, we consider that each sensor contributes a quality attribute for the identification work. In conclusion, the e-nose system developed in this study is effective in classifying different Longjing tea samples.
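The dataset dimensions given in Section 3.4 can be checked with a short sketch. The array layout (sample type, cup, second, sensor), the function names, and the random placeholder voltages are assumptions made for illustration; they are not taken from the authors' acquisition software.

```python
import numpy as np

LABELS = ["SA", "SB", "SC", "N1A", "N1B", "N2A", "N2B"]
N_CUPS, N_SECONDS, N_SENSORS = 4, 80, 8


def make_placeholder_dataset(seed=0):
    """Random voltages in [0, 5) V standing in for real e-nose readings (assumption)."""
    rng = np.random.default_rng(seed)
    shape = (len(LABELS), N_CUPS, N_SECONDS, N_SENSORS)
    return rng.uniform(0.0, 5.0, size=shape)


def per_second_samples(data):
    """One row per second: the 8 sensor voltages, labelled with the sample type index."""
    X = data.reshape(-1, N_SENSORS)
    y = np.repeat(np.arange(len(LABELS)), N_CUPS * N_SECONDS)
    return X, y


def per_cup_vectors(data):
    """One 640-dimensional vector (80 s x 8 sensors) per cup."""
    return data.reshape(len(LABELS) * N_CUPS, N_SECONDS * N_SENSORS)


data = make_placeholder_dataset()
X, y = per_second_samples(data)
cups = per_cup_vectors(data)
print(data.size, X.shape, cups.shape)
```

Treating each second as one 8-feature sample gives 2,240 rows, while treating each cup as one vector gives 28 rows of 640 values; both views contain the same 17,920 voltages.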

3.5. Detection method

We tested three common machine learning methods to find the best detection method for data analysis. The results showed that random forest is best suited for this work. Section 4.1 shows the detailed results. Random forest is an ensemble learning technique that can deal with high-dimensional data without feature extraction. Ensemble learning usually produces a series of individual weak learners and uses certain strategies to combine them. The outputs of all weak classifiers are combined by voting to obtain the final result. The combination of the weak

Fig. 3. Sensor response (smellprint) patterns of all 8 sensors for each sample type.

the original tea aroma data D and generated 10 independent training/test dataset splits, which implies a 10-fold cross validation. We considered the readings of the eight sensors in each second as one unit of data, and 20% of the data were used as the test set. Thus, we had 640 results for each sample. We tested the following three common machine learning classification methods in the contrast experiments: RF (Random Forest), MLP (Multi-Layer Perceptron), and SVM (Support Vector Machine). Fig. 5 shows the confusion matrices for the results of the classification methods. The diagonal elements represent the number of data points for which the predicted category is the same as the true category, whereas the off-diagonal elements are the incorrect predictions. Higher diagonal values of the confusion matrix indicate a higher number of correct predictions. The figure shows that random forest has the best performance, with an accuracy of 99.42%, which means that it can provide a nearly exact classification of Longjing tea. We also compared our results with similar studies that discriminate teas using e-nose instruments. Table 3 shows their sensors, methods, and results. From the classification results, we can see that the performance of our system is relatively good.
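A comparison of this kind could be set up in scikit-learn roughly as follows. This is a sketch under stated assumptions: the data are synthetic Gaussian clusters rather than real sensor voltages, the 10 shuffled 80/20 splits are modelled with `ShuffleSplit`, the MLP and SVM use library defaults with standardisation, and the forest uses the 80 trees and depth 15 reported in Section 4.2. None of these settings is confirmed by the paper beyond what the text states.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import ShuffleSplit
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def synthetic_readings(n_classes=7, n_per_class=100, n_sensors=8, seed=0):
    """Placeholder data (assumption): one Gaussian cluster of voltages per tea label."""
    rng = np.random.default_rng(seed)
    centres = rng.uniform(0.5, 4.5, size=(n_classes, n_sensors))
    X = np.vstack([c + rng.normal(0.0, 0.05, size=(n_per_class, n_sensors))
                   for c in centres])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def compare_classifiers(X, y, n_splits=10, test_size=0.2, seed=0):
    """Mean accuracy and summed confusion matrix over shuffled train/test splits."""
    labels = np.unique(y)
    models = {
        "RF": RandomForestClassifier(n_estimators=80, max_depth=15, random_state=seed),
        "MLP": make_pipeline(StandardScaler(),
                             MLPClassifier(max_iter=1000, random_state=seed)),
        "SVM": make_pipeline(StandardScaler(), SVC()),
    }
    splitter = ShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=seed)
    results = {}
    for name, model in models.items():
        accuracies = []
        summed_cm = np.zeros((len(labels), len(labels)), dtype=int)
        for train_idx, test_idx in splitter.split(X):
            fitted = clone(model).fit(X[train_idx], y[train_idx])
            predicted = fitted.predict(X[test_idx])
            accuracies.append(accuracy_score(y[test_idx], predicted))
            summed_cm += confusion_matrix(y[test_idx], predicted, labels=labels)
        results[name] = {"mean_accuracy": float(np.mean(accuracies)),
                         "confusion_matrix": summed_cm}
    return results


X, y = synthetic_readings()
results = compare_classifiers(X, y)
for name, outcome in results.items():
    print(name, round(outcome["mean_accuracy"], 4))
```

The diagonal of each summed confusion matrix counts correct predictions across all splits, which matches how Fig. 5 is read in the text.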

classifiers forms a strong classifier, which can improve the generalisation performance of the classification algorithms. The fundamental unit of random forest is a base estimator and the most commonly used base estimator is the decision tree. In this work, we used the classification and regression tree (CART) as the weak classifier. The algorithm flow is as follows.

Firstly, we separate the original tea aroma dataset D into training sets and testing sets according to the time interval t. Secondly, we use the training sets to build decision trees, which are CART models in this study. The depth and number of decision trees in our model will be discussed in Section 4.2. The Gini coefficient is used to split the attributes for each CART, as follows:

The larger the Gini coefficient, the higher the degree of sample uncertainty. Therefore, the split with the smallest Gini coefficient is the optimal one. Usually, we split the attributes until all training samples in a node belong to the same category. We can also set a threshold value to limit the depth of the tree. Thirdly, all CART models together compose the random forest structure. The quality attributes of our decision trees are discussed as feature importances in Section 4.3. Thus, the tea classification result is the voting result of all CART models.

4.2. Parameter tuning

We used a grid search method to perform the parameter tuning. During the experiments, we found that two parameters matter most for the Longjing tea classification model built in this research: the number of trees in the forest and the maximum depth of each tree. The number of trees is a parameter of the bagging framework. It determines the maximum number of weak classifiers. As the value increases, the variance of the model decreases, which improves the accuracy of the model. Therefore, we sampled 19 equidistant values from 10 to 190. Fig. 6 shows the mean accuracy for each value; the highest mean accuracy corresponds to 80 trees. The maximum depth is a parameter of the CART decision trees. It determines the maximum depth of each decision tree. Each decision tree treats data as nodes, and each node is split according to a certain attribute. In this way, all data can be separated into different categories. The maximum depth determines how far the splitting is carried out. Fig. 7 shows the growth trend. The accuracy approaches 1 at a maximum depth of 15, which is taken as the most suitable value.
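A grid search over these two parameters might look like the sketch below, using scikit-learn's `GridSearchCV`. The 19 tree counts from 10 to 190 follow the text; the depth grid, the 3-fold cross-validation, and the synthetic data are assumptions, and the demonstration call uses a reduced grid only to keep the run short.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# 19 equidistant tree counts, 10, 20, ..., 190, as described in the text.
PAPER_TREE_GRID = list(range(10, 200, 10))


def tune_forest(X, y, tree_grid, depth_grid, cv=3, seed=0):
    """Exhaustive search over number of trees and maximum tree depth."""
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": tree_grid, "max_depth": depth_grid},
        scoring="accuracy",
        cv=cv,  # fold count is an assumption, not stated in the paper
    )
    search.fit(X, y)
    return search


# Synthetic stand-in data (assumption): 3 classes, 8 sensor features.
rng = np.random.default_rng(0)
centres = rng.uniform(0.5, 4.5, size=(3, 8))
X = np.vstack([c + rng.normal(0.0, 0.3, size=(60, 8)) for c in centres])
y = np.repeat(np.arange(3), 60)

# Reduced grid for a quick demonstration; pass PAPER_TREE_GRID for the full sweep.
search = tune_forest(X, y, tree_grid=[10, 40, 80], depth_grid=[3, 9, 15])
print(search.best_params_, round(search.best_score_, 4))
```

`search.cv_results_["mean_test_score"]` holds the mean accuracy for every grid point, which is the quantity plotted against the number of trees and the depth in Figs. 6 and 7.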

4. Results and discussion

A main objective of this study was to determine whether there are any differences between standard and non-standard classification rules for tea. Here, we explore the classification rules for different brands and compare different quality levels of the same brand as well as the same quality level of different brands for the seven Longjing tea samples. This study used Python for the implementation. All programming tasks were performed with the scikit-learn framework [35]. The computer operating system was Windows 8 with an Intel Core i5 processor.

4.3. Effectiveness of sensors

We tested the contribution of each sensor to our results. Our method treats each sensor as a feature. Section 3.5 explained the principle of the algorithm. We used the Gini coefficient to gather the statistics of each feature. All branch nodes of each feature have specific Gini coefficient values. The decrease in the Gini coefficient value can be regarded as the importance of each feature. During the classification task, all features (sensors) can be sorted by their importance. Fig. 8 shows the feature importance result of our tea classification model. As mentioned above, the 8 gas sensors represent 8 features: 1 (TGS813), 2 (TGS822), 3 (TGS2602), 4 (TGS2620), 5 (TGS2600), 6 (MQ-138), 7 (MQ-135), 8 (MQ-6). We sort the features of each sample set as follows:

{SA, SB, SC}: 6 > 4 > 3 > 5 > 2 > 7 > 1 > 8
{N1A, N1B}: 3 > 5 > 1 > 4 > 6 > 7 > 8 > 2
{N2A, N2B}: 6 > 5 > 8 > 3 > 1 > 7 > 4 > 2

The importance ranking of the gas sensors differs between tea companies. The purpose of our research is to compare the ranking of the blind samples with our existing sample database and establish a reference for the identification of different Longjing tea brands. In our sample database, all sensors contributed to the classification result. Thus, our e-nose system was proven effective.
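The ranking step can be illustrated with scikit-learn's `feature_importances_` attribute, which reports the normalised mean decrease in Gini impurity per feature. The sensor order follows the numbering in the text; the synthetic data, in which only the MQ-138 column depends on the class, are an assumption chosen so that the expected ranking is obvious.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Feature order 1..8 as numbered in the text.
SENSORS = ["TGS813", "TGS822", "TGS2602", "TGS2620",
           "TGS2600", "MQ-138", "MQ-135", "MQ-6"]


def rank_sensors(X, y, seed=0):
    """Sort sensors by impurity-based importance, most important first."""
    forest = RandomForestClassifier(n_estimators=80, max_depth=15, random_state=seed)
    forest.fit(X, y)
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(SENSORS[i], float(importances[i])) for i in order]


# Synthetic stand-in data (assumption): every column is noise except MQ-138.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 100)
X = rng.normal(2.5, 0.2, size=(300, len(SENSORS)))
X[:, 5] = 1.0 + y + rng.normal(0.0, 0.05, size=300)

ranking = rank_sensors(X, y)
for sensor, importance in ranking:
    print(f"{sensor:8s} {importance:.3f}")
```

Fitting one forest per sample set ({SA, SB, SC}, {N1A, N1B}, {N2A, N2B}) and ranking each would give three orderings comparable to those listed above.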

$$\mathrm{Gini}(D) = 1 - \sum_{i=1}^{c} p_i^2$$

where $c$ is the number of sample types and $p_i$ is the probability of each type. If the CART treats type $A$ as the node split, the corresponding children are $D_L$ and $D_R$. The Gini coefficient after the split is

$$G(D, A) = \frac{|D_L|}{|D|}\,\mathrm{Gini}(D_L) + \frac{|D_R|}{|D|}\,\mathrm{Gini}(D_R)$$
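The two formulas above translate directly into code. The sketch below is a plain NumPy illustration, not the CART implementation used in the paper; the threshold scan over a single feature is an added helper (hypothetical name `best_threshold`) showing how the smallest post-split Gini value selects a split.

```python
import numpy as np


def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2 over the class proportions in D."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return float(1.0 - np.sum(p ** 2))


def split_gini(left_labels, right_labels):
    """G(D, A): size-weighted Gini of the two children D_L and D_R."""
    n_left, n_right = len(left_labels), len(right_labels)
    n = n_left + n_right
    return (n_left / n) * gini(left_labels) + (n_right / n) * gini(right_labels)


def best_threshold(values, labels):
    """Scan midpoints of one feature; return (threshold, G) with the smallest G.

    Returns None when the feature has a single distinct value.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    unique = np.unique(values)
    candidates = (unique[:-1] + unique[1:]) / 2.0
    best = None
    for threshold in candidates:
        mask = values <= threshold
        score = split_gini(labels[mask], labels[~mask])
        if best is None or score < best[1]:
            best = (float(threshold), score)
    return best


print(gini([0, 0, 1, 1]))                              # two balanced classes
print(split_gini([0, 0], [1, 1]))                      # a perfect split
print(best_threshold([1, 2, 3, 4], [0, 0, 1, 1]))
```

A perfectly mixed two-class node has a Gini value of 0.5, and a split that separates the classes completely brings the weighted value to 0.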

4.1. Comparison of classification methods

We present the preliminary analysis of the tea aroma dataset in order to provide proof of the effectiveness and efficacy of the e-nose performance in assessing the tea quality. We conducted statistical PCA to compare different sample types. We separated our samples into two categories. We can see from Fig. 4 below that the standard Longjing tea sample sets can be roughly divided into 3 groups. However, there are overlaps among the non-standard Longjing tea sample sets. We did not consider the time factor t in the comparison of the classification methods owing to its generality. We randomly shuffled


Fig. 4. Principal component analysis (PCA) of 8-sensor output responses.
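A two-component projection of the kind plotted in Fig. 4 can be sketched with scikit-learn's `PCA`. Standardising the sensor voltages first is an assumption (the paper does not say whether the responses were scaled), and the input here is synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def project_2d(X):
    """Standardise the 8 sensor responses and project them onto 2 principal components."""
    scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=2)
    scores = pca.fit_transform(scaled)
    return scores, pca.explained_variance_ratio_


# Synthetic stand-in data (assumption): 3 clusters of 8-sensor readings.
rng = np.random.default_rng(0)
centres = rng.uniform(0.5, 4.5, size=(3, 8))
X = np.vstack([c + rng.normal(0.0, 0.1, size=(80, 8)) for c in centres])

scores, ratio = project_2d(X)
print(scores.shape, np.round(ratio, 3))
```

Plotting `scores[:, 0]` against `scores[:, 1]`, coloured by tea label, would show whether the sample sets separate into groups or overlap.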

Fig. 5. Confusion matrix results of different methods.

Table 3
Results comparison of different researchers.

Researchers | Tea kind | Sensors | Methods and results
Xu et al. [11] | Longjing Tea | MOS1, MOS2, MOS3, MOS4, MOS5, MOS6, MOS7, MOS8, MOS9, MOS10 | Support Vector Machine (SVM): 91.67%; Random Forest (RF): 88.89%
Modak et al. [27] | Indian Black Tea | TGS832, TGS823, TGS2600, TGS2610, TGS2611 | Fuzzy based Response of Signal with Time (FRST): 90.11%
Borah et al. [36] | Unknown | TGS880, TGS826, TGS825, TGS822 | Multi-Layer Perceptron (MLP): 90.77%; Radial Basis Function (RBF): 92.31%; Constructive Probabilistic Neural Network (CPNN): 93.85%
Lelono et al. [37] | Indonesia Black Tea | MQ-7, TGS2600, TGS2602, TGS2620, TGS813, TGS822, TGS825, TGS826 | Principal Component Analysis (PCA): 95.6%
Our research | Longjing Tea | TGS813, TGS822, TGS2602, TGS2620, TGS2600, MQ-138, MQ-135, MQ-6 | Our method: 99.8%

Fig. 6. Testing with different numbers of trees in the Longjing tea classification model.

Fig. 7. Testing with different maximum depths of each tree in CART.

Fig. 8. Feature importance results for different datasets.

4.4. Time sequence of data

We collected the aroma data as a time series. To obtain a more precise result, we added the time factor t to identify the contribution of data in different time slices to our classification result. Owing to the consideration of the time factor, the amount of input data became very small. We used the leave-one-out method to separate the training set from the testing set. Moreover, the testing sets were blind samples. Fig. 9 shows the division of the data for each cup, where we separated 80 s of data into 80/t testing or training sets. The classification task was performed separately for each time slice.

Fig. 9. Division of data into training set and testing set.

Figs. 10 and 11 show completely different results. In this experiment, we assigned the time factor t as 10 s and 20 s. It can be seen in Fig. 10 that our e-nose system can precisely distinguish standard certified Longjing tea samples. Results from all time slices were above 95%. Fig. 11 shows a large fluctuation in the accuracy. Most accuracy results of the non-standard certified sets are lower than 60%. We have shown above that the data from four cups of tea are sufficient to obtain high identification accuracy. The e-nose system developed in this study can identify the quality level successfully. In our database, the data for SA, SB, and SC were very different from each other, whereas those of N1A, N1B, N2A, and N2B were similar to each other. We ascribe this result to the unpredictability of the Longjing tea market. The time factor t is unlikely to have much impact on the results. In our future research, we may attempt to reduce the amount of computation.

Fig. 10. Classification accuracy of standard certified set (SA, SB, SC).

4.5. Comparison among groups

We designed a series of experiments to distinguish standard certified tea from non-standard certified tea. We used the standard classification rule to grade the non-standard certified sets. We considered the standard certified sets as the training set and the non-standard certified sets as the blind testing sets. Table 4 shows the classification results. Each row in Table 4 contains the prediction probabilities of the non-standard set. The results of all non-standard sets are similar to those of SC. Thus, it is proven that the standard classification rule is suitable for non-standard sets. The data for each set can be assigned a prediction level according to the standard classification rule.

For detailed experimental comparison, we separated all tea data into 13 subsets according to their quality level and brand. All 13 subsets can be divided into 3 main categories. We tested with t = 10 and t = 20 to obtain more comprehensive experimental results. Table 5 shows the classification results of each set. The higher the number, the greater the difference between the tea samples.

The first category is the comparison between the brands. We tested all three brands to find whether their classification rules are different from each other. The accuracy was higher than 0.9 only for the standard set. The accuracy of the non-standard set of the first brand (N1A, N1B) was slightly higher than that of the second brand (N2A, N2B). However, the result was poorer when the two non-standard brands were combined. We have shown earlier that the non-standard sets are similar in results to level C of the standard sets. Therefore, we added one test set (SC) to demonstrate their classification accuracy. The result was lower than 0.6, which shows that SC, N1A, N1B, N2A, and N2B are similar to each other.

The second and third main categories compare the same quality level of tea belonging to different brands. The accuracy was always 1.0 when the standard sets were compared separately. However, the accuracy declined significantly when we considered different brands of the same level in the non-standard sets. These results show that non-standard sets have no strict rules for tea grading. In conclusion, certification methods for non-standard tea may be unreliable and misleading.
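The time-slice evaluation described in Section 4.4 can be sketched as follows. This is one possible reading of the procedure, stated as an assumption: for each slice of t seconds, one cup per tea label is held out in turn as the blind test set and the remaining three cups form the training set. The array layout (label, cup, second, sensor), the function name `slice_accuracies`, and the synthetic data are illustrative only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def slice_accuracies(data, t, n_estimators=80, max_depth=15, seed=0):
    """Leave-one-cup-out accuracy for each window of t seconds.

    data has shape (n_labels, n_cups, n_seconds, n_sensors).
    Returns one mean accuracy per time slice (n_seconds / t values).
    """
    n_labels, n_cups, n_seconds, n_sensors = data.shape
    if n_seconds % t != 0:
        raise ValueError("t must divide the recording length")
    accuracies = []
    for start in range(0, n_seconds, t):
        window = data[:, :, start:start + t, :]
        fold_scores = []
        for held_out in range(n_cups):
            kept = [c for c in range(n_cups) if c != held_out]
            X_train = window[:, kept].reshape(-1, n_sensors)
            y_train = np.repeat(np.arange(n_labels), len(kept) * t)
            X_test = window[:, held_out].reshape(-1, n_sensors)
            y_test = np.repeat(np.arange(n_labels), t)
            forest = RandomForestClassifier(
                n_estimators=n_estimators, max_depth=max_depth, random_state=seed)
            forest.fit(X_train, y_train)
            fold_scores.append(forest.score(X_test, y_test))
        accuracies.append(float(np.mean(fold_scores)))
    return accuracies


# Synthetic stand-in data (assumption): 3 labels, 4 cups, 80 s, 8 sensors.
rng = np.random.default_rng(0)
centres = rng.uniform(0.5, 4.5, size=(3, 1, 1, 8))
data = centres + rng.normal(0.0, 0.1, size=(3, 4, 80, 8))

acc_t10 = slice_accuracies(data, t=10, n_estimators=20)
acc_t20 = slice_accuracies(data, t=20, n_estimators=20)
print(len(acc_t10), len(acc_t20))
```

With t = 10 the 80 s recording yields 8 slices and with t = 20 it yields 4, matching the 80/t division described for Fig. 9.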

Table 4 Results of using the standard certification method to classify the non-standard certified sets.

Set    SA      SB      SC
N1A    0.1625  0.1125  0.7250
N1B    0.1750  0.1187  0.7062
N2A    0.1625  0.1125  0.7250
N2B    0.1625  0.1125  0.7250
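Each row of Table 4 sums to approximately 1, which is consistent with reading every entry as the share of a non-standard set that the standard-trained classifier assigns to SA, SB, or SC. The sketch below shows how such shares could be computed from a list of predicted labels. The helper `grade_shares` is hypothetical, and the counts (26, 18, and 116 out of 160) are illustrative values chosen only because they reproduce the N1A row; the paper does not state the sample count here.

```python
from collections import Counter

def grade_shares(predicted_labels, grades=("SA", "SB", "SC")):
    """Return the fraction of samples assigned to each standard grade."""
    total = len(predicted_labels)
    if total == 0:
        raise ValueError("no predictions given")
    counts = Counter(predicted_labels)
    return {grade: counts.get(grade, 0) / total for grade in grades}

# Illustrative predictions for one non-standard set (assumed counts, not real data).
predictions = ["SA"] * 26 + ["SB"] * 18 + ["SC"] * 116
shares = grade_shares(predictions)
print(shares)
```

Read this way, about 70% of every non-standard set is assigned to grade SC, in line with the statement that the non-standard sets resemble level C of the standard sets.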

5. Conclusions & future work

In this paper, we proposed a system to distinguish between different brands of Longjing tea and identify their quality levels effectively. We verified the effectiveness of our e-nose system from three aspects: method analysis, experimental verification, and practical application. Firstly, we compared our method with other classification methods such as MLP and SVM. The results showed that our method has better identification performance. We created a database for the Longjing tea samples used in our study. The classification results of our method showed good agreement with the expected results. Secondly, the contribution of each sensor was analysed from the perspective of hardware. It was observed that each sensor contributed to the classification results. From the perspective of data analysis, we performed cross-contrast experiments among the Longjing tea samples of different brands and quality levels. In addition, we considered the time factor of the data in the experiments. Results showed that standard Longjing tea samples were graded precisely and that the time factor had little influence on the results. Thirdly, the identification accuracy for blind samples using our Longjing tea database reached over 95%. Our classification method can be used to build a more comprehensive Longjing tea database. The proposed method has positive implications for the standardisation of the Longjing tea market. Longjing tea samples from different companies have very different characteristics. Each company has its own classification rules. However, only the standard certified rule was proven reliable. The standard classification rule can be used to grade non-standard sets. In conclusion, it is possible to use a certain standard to regulate tea grading based on the tea aroma alone. In our future work, we plan to research the

Fig. 11. Classification accuracy of non-standard certified set (N1A, N1B, N2A, N2B).

Table 5 Accuracy of different datasets.

Time interval t = 10

Dataset                    1s-10s   11s-20s  21s-30s  31s-40s  41s-50s  51s-60s  61s-70s  71s-80s
{SA, SB, SC}               0.992    1.000    1.000    1.000    1.000    1.000    1.000    0.992
{N1A, N1B}                 0.625    0.625    0.625    0.625    0.625    0.738    0.813    0.863
{N2A, N2B}                 0.500    0.563    0.563    0.325    0.375    0.375    0.450    0.375
{N1A, N1B, N2A, N2B}       0.400    0.456    0.381    0.313    0.250    0.269    0.375    0.406
{SC, N1A, N1B, N2A, N2B}   0.490    0.529    0.509    0.419    0.484    0.479    0.530    0.505
{SA, N1A, N2A}             0.767    0.758    0.758    0.750    0.758    0.725    0.692    0.717
{SA, N1A}                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
{SA, N2A}                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
{N1A, N2A}                 0.650    0.638    0.625    0.625    0.663    0.575    0.563    0.563
{SB, N1B, N2B}             0.783    0.758    0.783    0.783    0.900    0.742    0.750    0.767
{SB, N1B}                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
{SB, N2B}                  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
{N1B, N2B}                 0.650    0.650    0.675    0.675    0.750    0.625    0.625    0.650

Time interval t = 20 and average over all slices

Dataset                    1s-20s   21s-40s  41s-60s  61s-80s  Average
{SA, SB, SC}               1.000    0.996    1.000    1.000    0.998
{N1A, N1B}                 0.625    0.681    0.731    0.819    0.700
{N2A, N2B}                 0.613    0.581    0.363    0.519    0.467
{N1A, N1B, N2A, N2B}       0.453    0.488    0.406    0.456    0.388
{SC, N1A, N1B, N2A, N2B}   0.531    0.536    0.520    0.518    0.504
{SA, N1A, N2A}             0.763    0.808    0.775    0.692    0.747
{SA, N1A}                  1.000    1.000    1.000    1.000    1.000
{SA, N2A}                  1.000    1.000    1.000    1.000    1.000
{N1A, N2A}                 0.650    0.725    0.644    0.563    0.624
{SB, N1B, N2B}             0.763    0.813    0.838    0.842    0.794
{SB, N1B}                  1.000    1.000    1.000    1.000    1.000
{SB, N2B}                  1.000    1.000    1.000    1.000    1.000
{N1B, N2B}                 0.631    0.725    0.875    0.763    0.691
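The Average column of Table 5 appears to be the plain mean of all twelve slice accuracies (eight 10 s slices followed by four 20 s slices). The short check below recomputes it for two datasets using values copied from the table, and also reports the spread between the best and worst slice as a rough indicator of how much the time factor matters. The variable names are hypothetical, and the spread is an added illustration, not a quantity reported in the paper.

```python
from statistics import mean

# Slice accuracies copied from Table 5: eight 10 s slices, then four 20 s slices.
table5_rows = {
    "{SA, SB, SC}": [0.992, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.992,
                     1.000, 0.996, 1.000, 1.000],
    "{N1A, N1B}": [0.625, 0.625, 0.625, 0.625, 0.625, 0.738, 0.813, 0.863,
                   0.625, 0.681, 0.731, 0.819],
}

# Mean over all twelve slices, rounded to the three decimals used in the table.
averages = {name: round(mean(values), 3) for name, values in table5_rows.items()}
# Difference between the best and worst slice for each dataset.
spreads = {name: round(max(values) - min(values), 3)
           for name, values in table5_rows.items()}

print(averages)
print(spreads)
```

The recomputed means match the tabulated 0.998 and 0.700. The spread is under 0.01 for the standard certified set and roughly 0.24 for the first non-standard brand, consistent with the statement that time has little influence on the standard samples.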

relationships between gas sensors and other varieties of green tea. Moreover, we intend to use the results from this research to build a universal detector for green tea.

Declaration of Competing Interest

All co-authors (Xiaohui Lu, Jin Wang, Guodong Lu, Bo Lin, Meizhuo Chang, Wei He) declare that they have no conflict of interest.

Acknowledgements

This work was supported by the Key R&D Program of Zhejiang Province [grant number 2017C02007] and the Robotics Institute of Zhejiang University [grant number K18-508116-001].


Xiaohui Lu received her B.S. degree in 2015 from Zhejiang University of Technology. She is currently pursuing her Ph.D. in the School of Mechanical Engineering, Zhejiang University, China. Her research interests include electronic engineering, machine learning and computer vision.

Jin Wang is an associate professor at the School of Mechanical Engineering, Zhejiang University, Hangzhou, China. He received his B.S. degree in Mechatronic Engineering in 2003 and his Ph.D. degree in Mechanical Engineering in 2008, both from Zhejiang University. His research interests include CAD/CAM, design automation and optimization, geometric modeling, reverse engineering and computer graphics.

Guodong Lu is a professor at the School of Mechanical Engineering, Zhejiang University, Hangzhou, China. He received his B.S., M.S. and Ph.D. degrees from Zhejiang University in 1983, 1990 and 2000, respectively. His research interests include CAD/CAM in the soft-products industry and 3D reconstruction from 2D engineering drawings.

Bo Lin received his B.S. degree in 2015 from Zhejiang University of Technology. He is currently pursuing his Ph.D. in the College of Computer Science, Zhejiang University, China. His research interests include data mining, machine learning and computer vision in the medical field.

Meizhuo Chang received her M.S. degree in 2019 from Zhejiang University of Technology.

Wei He is currently pursuing his M.S. in the School of Mechanical Engineering, Zhejiang University, China.