Accepted Manuscript
Title: Chemical Profiling of Floral and Chestnut Honey using High-Performance Liquid Chromatography-Ultraviolet Detection
Authors: Ahmet Kemal Aloglu, Peter de B. Harrington, Saliha Sahin, Cevdet Demir, Mesut Ertan Gunes
PII: S0889-1575(17)30140-0
DOI: http://dx.doi.org/doi:10.1016/j.jfca.2017.06.002
Reference: YJFCA 2905
To appear in: Journal of Food Composition and Analysis
Received date: 15-3-2017
Revised date: 16-5-2017
Accepted date: 2-6-2017
Please cite this article as: Aloglu, Ahmet Kemal., Harrington, Peter de B., Sahin, Saliha., Demir, Cevdet., & Gunes, Mesut Ertan., Chemical Profiling of Floral and Chestnut Honey using High-Performance Liquid Chromatography-Ultraviolet Detection. Journal of Food Composition and Analysis http://dx.doi.org/10.1016/j.jfca.2017.06.002
This is a PDF file of an unedited manuscript that has been accepted for publication. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form.
Original Research Article
Chemical Profiling of Floral and Chestnut Honey using High-Performance Liquid Chromatography-Ultraviolet Detection
Ahmet Kemal Aloglu (a), Peter de B. Harrington (a,*), Saliha Sahin (b), Cevdet Demir (b), Mesut Ertan Gunes (c)
(a) Center for Intelligent Instrumentation, Clippinger Laboratories, Department of Chemistry and Biochemistry, Ohio University, Athens, Ohio 45701, United States
(b) Department of Chemistry, Faculty of Science and Arts, University of Uludag, Bursa 16059 Turkey
(c) Vocational School of Technical Sciences, University of Uludag, Bursa 16059 Turkey
(*) Author to whom correspondence should be addressed; E-Mail: [email protected]; Tel.: +1-740-994-0265; Fax: +1-740-5930148.
Highlights
Phenolic compounds can be used as markers for specific honey types.
HPLC-UV analysis provides two-way data to improve classification.
FuRES and SVMTreeG provided the best classification rates for the type of honey.
Data preprocessing can improve the classification rates.
Honey source location may be determined by chemical profiling.
Abstract
Using the two-way images of phenolic compounds from high-performance liquid chromatography-ultraviolet diode array detection (HPLC-DAD), floral and chestnut honey from Turkey were successfully differentiated. A fuzzy rule-building expert system (FuRES), support vector machine classification tree (SVMTreeG), and super partial least-square discriminant analysis were used to develop classification models. Normalization, retention time alignment, square root transform, and dissimilarity kernel were evaluated as data preprocessing methods. The bootstrapped Latin partition was used with 100 bootstraps and 4 partitions. Classification rates of FuRES and SVMTreeG with a square root transform were 97.6±0.4% and 97.6±0.4% for classifying the type of honey, respectively. The measures of precision are 95% confidence intervals. HPLC-DAD was demonstrated as a reliable analytical method for authentication of honey.
Keywords
Chestnut honey, floral honey, Food analysis, Food composition,
HPLC-DAD, classification, phenolic compounds, chemometrics, FuRES, SVMTreeG.
1 Introduction
The nutritional and medicinal properties of honey make it a very valuable natural product. Honey is produced by honeybees (Apis mellifera) from a variety of pollens and nectars, and the floral source may influence not just the flavor but also the medicinal properties of the honey. Reviews by Eteraf-Oskouei and Najafi (2013) and by Molan (1999) on the medicinal use of honey report treatments for several human conditions, such as different types of wound healing, gastrointestinal tract diseases, cardiovascular disease, and various ophthalmological conditions, as well as the use of honey as a remedy in ancient Egyptian and Greek cultures. Interest in studying and evaluating the constituents of honey has increased, and published research is available on vitamin (Ciulu et al., 2011), amino acid (Kivrak, 2015), organic acid (Daniele et al., 2012), and mineral (Kucuk et al., 2007; Vanhanen et al., 2011) contents. Antioxidant activity has also been studied (Al-Mamary et al., 2002; Aljadi and Kamaruddin, 2004; Alvarez-Suarez et al., 2012). Although honey is mainly composed of carbohydrates such as fructose and glucose (Bogdanov et al., 2008), specific phenols (gallic, syringic, and caffeic acids, along with luteolin, galangin, and apigenin) have also been reported (Can et al., 2015; Marshall et al., 2014; Perna et al., 2013).

Several analytical methods have been used to determine the botanical origin of honey. Some of the techniques used for analysis
are gas chromatography/mass spectrometry (GC/MS) for aromatic compounds (Mateo and Bosch-Reig, 1998); ion chromatography (IC) (Anupama et al., 2003) and high performance liquid chromatography (HPLC) for flavonoids, amino acids, sugars (Anklam, 1998), and phenolic acids (Andrade et al., 1997); and near infrared (NIR) and Raman spectroscopy to determine contamination (Bertelli et al., 2007).

Because unifloral honey has a higher commercial value than polyfloral honey, there is an incentive for adulteration in the honey market. Thus, fundamental analysis and classification that determine the value of honey based on its ingredients help to prevent fraud and protect honey consumers around the world. Therefore, it is important to be able to classify honey based on botanical origins and chemical properties. However, the data collected from analytical techniques may not explicitly give the desired solution for a classification, and multivariate statistical techniques play a crucial role in classifying the samples based on chemical properties and geographical regions. Chemical information can be extracted from a complex data set by multivariate analysis, which uses mathematical and statistical methods to evaluate the samples altogether. Chemometric techniques applied to chemical data collected from a variety of analytical instruments have proven to be very useful tools for authentication and classification purposes; thus, chemometrics has found its place in the literature (Danezis et al., 2016; Karoui and De Baerdemaeker,
2007). Some of the classifiers found in the literature are the fuzzy rule-building expert system (FuRES) (Harrington, 1991) and partial least square discriminant analysis (PLS-DA) (Barker and Rayens, 2003). However, the support vector machine (SVM) classification tree (Harrington, 2015), a classification method developed by Harrington in 2015, has not been used before for this purpose. Therefore, these classification techniques are used in this research to classify honey samples, and a brief introduction to each of these methods is given in the theory section.

FuRES (Harrington, 1991) is a classification method that builds a classification tree consisting of multivariate fuzzy logistic units. Examples of FuRES applications are matrix-assisted laser desorption/ionization mass spectroscopy (MALDI-MS) for proteomic analysis of amniotic fluids (Harrington et al., 2006), HPLC-electrospray ionization mass spectrometry (HPLC-EIMS) for classification of cultivation locations of Panax quinquefolius L samples from the United States and China (Sun et al., 2012), and GC-differential mobility spectroscopy (GC-DMS) for classification of the two-way gas chromatograms of a variety of fuel types (Rearden et al., 2007).

PLS-DA is another classification method. It is a special case of PLS, which finds a relationship between the X matrix as a predictor and the Y matrix as a response. In PLS-DA, the Y matrix consists of binary variables that define the different classes, i.e. floral and chestnut honey, found in the X matrix (Barker and Rayens, 2003). This classification method successfully classified
chemical data of organically and conventionally produced basils by GC/MS (Wang et al., 2013), spectral fingerprints of Panax species by ultraviolet (UV), NIR, and MS spectrometry (Chen et al., 2011), and polychlorinated biphenyl commercial mixtures by GC/MS integrated with solid phase microextraction (Zhang and Harrington, 2013). The PLS-DA algorithm was revised to determine the optimum number of latent variables automatically using an internal bootstrapped Latin partition, and the revised method is referred to as super PLS-DA (sPLS-DA) (Aloglu et al., 2016).

An additional classification method is the SVM, invented by Cortes and Vapnik in 1995 (Cortes and Vapnik, 1995). The theory behind the SVM is to find a hyperplane that gives the largest margin separating two different classes (Boser et al., 1992; Cortes and Vapnik, 1995). This simple idea has found use in analytical chemistry, such as in the detection of endometrial cancers from tissue sections with NIR (Zhang et al., 2011), in the analysis of ink and pigment for forensic purposes with combined Raman spectroscopy and laser-induced breakdown spectroscopy (LIBS) (Hoehse et al., 2012), and in the clinical risk assessment of patients with chronic kidney disease (Chen et al., 2016).

These classification methods, FuRES, sPLS-DA, and SVMTreeG, were used to evaluate the two-way chromatographic-spectral images from HPLC-UV data of floral and chestnut honey, with validation by bootstrapped Latin partitions (BLPs). The two-way chromatographic-spectral images were used to build classifiers while
normalization, retention time alignment, square root transform, and dissimilarity kernel preprocessing were applied to the data to compare the classification rates for the type of honey and the geographical origin of the honey.
2 Theory
2.1 Data Preprocessing
The data preprocessing methods compared in this research for their effects on the classification rates are normalization, retention time (RT) alignment, square root transform, and dissimilarity kernel. Other preprocessing methods were also evaluated, for example the log transform, but it was too strong and resulted in poor signal-to-noise ratios and classification performance.

Normalization is performed to give equal weight to each data object. Systematic variations occur in the data because different amounts of sample arise from different extractions or injections, and these variations need to be corrected. Each object is normalized by dividing its variables by the square root of the sum of its squared variables, as given in Equation 1.

$$x^{n}_{i,j} = \frac{x_{i,j}}{\sqrt{\sum_{k=1}^{n} x_{i,k}^{2}}} \qquad (1)$$
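As a hedged illustration of Equation 1, the following minimal Python sketch scales each row of a small data matrix to unit vector length. It assumes that every two-way image has already been unfolded into one row of the matrix; the function name `normalize_rows`, the handling of all-zero rows, and the example intensities are illustrative assumptions and are not taken from the original MATLAB implementation.

```python
import math

def normalize_rows(X):
    """Scale each row (object) of X to unit Euclidean length (Equation 1)."""
    normalized = []
    for row in X:
        norm = math.sqrt(sum(value * value for value in row))
        if norm == 0.0:
            # Assumption: an all-zero row cannot be scaled, so it is kept as is.
            normalized.append(list(row))
        else:
            normalized.append([value / norm for value in row])
    return normalized

# Hypothetical intensities: the second object mimics the same extract
# injected at twice the amount.
X = [[3.0, 4.0, 0.0],
     [6.0, 8.0, 0.0]]
Xn = normalize_rows(X)
print(Xn)
```

After normalization the two hypothetical objects coincide, which is the intended effect: differences caused only by the injected amount are removed.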
The measured intensities $x_{i,j}$ of object i and measurement j form a data matrix with m rows of objects and n columns of measurements, and each normalized object $x^{n}$ has unit vector length.

The chromatographic peaks were aligned with respect to retention time. Aligning the peaks in the chromatograms may reduce the classification rate, because the chromatograms will be more similar after alignment. The UV spectral way of the two-way object assists the alignment algorithm in achieving correct matching of peaks in the same retention time window. The
alignment algorithm used a third-order polynomial mapping of retention times to maximize the correlation with the average two-way object for the dataset. Other polynomial orders were also evaluated, but an acceptable alignment was achieved with a third-order polynomial, and higher-order polynomials did not show any improvement. The 'pchip' spline, a MATLAB function, was used for interpolation of intensities. The polynomial for mapping the alignment is given in Equation 2

$$t^{*} = b_0 + b_1 t + b_2 t^{2} + b_3 t^{3} \qquad (2)$$

for which $t$ is the unaligned retention time and $t^{*}$ is the new time used for interpolation of the two-way object. The corrected peaks after alignment can be seen in Figure 1. Zoomed-in views of the ranges between 20 and 30 min and between 30 and 40 min, given in the Supplementary Materials (Figure S1), demonstrate that the alignment worked well.

The square root transform of the absorbance in the two-way chromatographic-spectral images was performed to decrease the dynamic range by reducing the higher peaks and inflating the relative importance of the smaller peaks of the two-way data. This method provided significant improvement for some cases and did not result in a serious loss of performance for the other cases; therefore, it was applied uniformly for all four classification approaches. A dissimilarity transformation, squared
Euclidean, was evaluated that finds the distances between each pair of objects and is given by Equation 3

$$d_{i,j} = \sum_{k=1}^{n} \left(x_{i,k} - x_{j,k}\right)^{2} \qquad (3)$$
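The square root transform and the squared Euclidean dissimilarity of Equation 3 can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the clipping of negative intensities to zero before the square root is an assumption that the text does not specify, and the three example objects are invented.

```python
import math

def sqrt_transform(X):
    """Element-wise square root of the intensities.

    Assumption: negative baseline values are clipped to zero first.
    """
    return [[math.sqrt(max(value, 0.0)) for value in row] for row in X]

def squared_euclidean_kernel(X):
    """Return the m x m matrix D with D[i][j] = sum_k (x_ik - x_jk)^2."""
    m = len(X)
    D = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            d = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            D[i][j] = D[j][i] = d  # the dissimilarity is symmetric
    return D

# Hypothetical objects (rows) with three measurements each.
X = [[0.0, 4.0, 100.0],
     [1.0, 9.0, 64.0],
     [0.0, 4.0, 100.0]]
Xs = sqrt_transform(X)            # large peaks are compressed the most
D = squared_euclidean_kernel(Xs)  # 3 x 3 dissimilarity matrix
print(D)
```

In this sketch the classifier would receive the m × m matrix `D` in place of the m × n data matrix, so the number of variables equals the number of objects.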
Here, $d_{i,j}$ is the dissimilarity calculated as the distance between objects i and j, so that the data of m objects by n columns are represented by an m × m matrix (Zerzucha et al., 2012; Zerzucha and Walczak, 2012).

2.2 Bootstrapped Latin Partition
Before each classification method is briefly explained, the bootstrapped Latin partition (BLP) (Harrington, 2006) is described. Bootstrapping is a resampling technique that was developed for expensive and precious samples (Grunkemeier and Wu, 2004). It works by random sampling with replacement, which allows for statistical estimates with confidence intervals. The Latin partition method divides the data randomly into subsets with the constraint of maintaining the same class distribution. An advantage of the Latin partition method is that prediction objects are used once and only once for validation. Because the Latin partitions are generated randomly, the procedure can be bootstrapped many times to increase the statistical power of the calculations. In this research, 100 bootstraps were used with 4 Latin partitions.
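A minimal sketch of the partitioning idea described above is given below in Python. It is not the authors' MATLAB code: the function names, the random seed, and the round-robin assignment are illustrative assumptions. The sketch also treats the two replicate measurements of a honey sample as independent objects, which may differ from how the replicates were handled in the study.

```python
import random
from collections import defaultdict

def latin_partitions(labels, n_partitions, rng):
    """Randomly split object indices into folds that keep the class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_partitions)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled objects of this class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[(position + offset) % n_partitions].append(index)
        offset += len(indices)
    return folds

def bootstrapped_latin_partitions(labels, n_bootstraps=100, n_partitions=4, seed=0):
    """Yield (training, prediction) index lists for every bootstrap and partition."""
    rng = random.Random(seed)
    for _ in range(n_bootstraps):
        folds = latin_partitions(labels, n_partitions, rng)
        for k, prediction in enumerate(folds):
            training = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield training, prediction

# 30 chestnut and 26 floral samples, each measured twice (112 objects).
labels = ["chestnut"] * 60 + ["floral"] * 52
splits = list(bootstrapped_latin_partitions(labels))
print(len(splits))  # 100 bootstraps x 4 partitions
```

Within one bootstrap, every object appears in exactly one prediction set, so pooling the four sets gives one prediction per object; repeating this over the bootstraps gives the spread from which confidence intervals can be estimated.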
2.3 Fuzzy Rule-Building Expert System
FuRES (Harrington, 1991) is a chemometric pattern recognition method that builds a classification tree consisting of neural network processing units. The FuRES algorithm projects data from a multidimensional space onto a normalized weight vector to produce scalar scores that are then used for calculating the fuzzy entropy of the classification. The weight vector partitions the data into two subsets that are the branches of a classification tree. A divide and conquer algorithm is applied until all the objects of the same class are grouped at the leaves of the tree.

2.4 Support Vector Machine Classification Tree
The SVM was initially proposed in 1992 (Boser et al., 1992; Cortes and Vapnik, 1995). As binary linear classifiers, SVMs bipolarly encode classes as either -1 or +1. They determine a weight vector that maximizes the margin about a separating hyperplane between two classes of objects in the data space. An SVM classification tree (Harrington, 2015), SVMTreeG, is a recently developed method that determines the encoding based on the distribution of the data. This differs from the original approach, which uses permutations of the classes to find the binary encoding. Predictions are based on Equation 4.

$$\hat{y}_i = \mathbf{x}_i \mathbf{w} + b \qquad (4)$$
The predicted class $\hat{y}_i$ of object $\mathbf{x}_i$ is defined by a weight vector $\mathbf{w}$ that is orthogonal to the hyperplane that separates the objects in the data space. Each object $\mathbf{x}_i$ is a row vector with n measurements or variables. SVMTreeG uses two approaches, variance based on principal component analysis and covariance based on PLS, to map the multiclass binary encoding onto a bipolar encoding. Details can be found in the reference (Harrington, 2015). After the SVM models are built, the most efficient model with the lowest entropy of classification is chosen for classifying the samples at the branches of the tree. Because the SVM method operates on a kernel and uses quadratic programming, it is faster than the FuRES method.
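To make the prediction rule of Equation 4 concrete, the short Python sketch below evaluates the linear score for a single object and maps its sign onto the bipolar encoding. The weight vector, the bias, and the example objects are hypothetical values chosen for illustration; training the SVM by quadratic programming and building the tree are not shown, and assigning a score of exactly zero to the +1 class is an assumption.

```python
def predict_bipolar(x, w, b):
    """Return the score x.w + b (Equation 4) and the bipolar class it implies."""
    score = sum(x_k * w_k for x_k, w_k in zip(x, w)) + b
    # Assumption: a score of exactly zero is assigned to the +1 class.
    return score, (1 if score >= 0 else -1)

# Hypothetical weight vector and bias for two measurements.
w = [1.0, -3.0]
b = 0.5

print(predict_bipolar([2.0, 1.0], w, b))  # score 2 - 3 + 0.5 = -0.5, class -1
print(predict_bipolar([4.0, 1.0], w, b))  # score 4 - 3 + 0.5 = 1.5, class +1
```

In a classification tree, a rule of this form sits at each branch, and the sign of the score decides which branch an object follows.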
2.5 Super Partial Least Square-Discriminant Analysis
The PLS-DA algorithm and model building for prediction are clearly explained in the literature (Brereton and Lloyd, 2014). The basic idea is that PLS-DA uses the Y matrix as a set of binary variables that define the classes of the objects in X. Super PLS-DA (sPLS-DA) is an automated version of PLS-DA that uses BLPs to optimize the number of latent variables by minimizing the average bootstrapped prediction error (Aloglu et al., 2016; Harrington et al., 2009).

3 Materials and Methods
3.1 Sample Collections and Extraction of Phenolic Compounds
Honey samples were collected directly from apiculturists in different geographical regions of Turkey in 2011-2012. Two types of honey were evaluated. Thirty samples of chestnut honey and twenty-six samples of floral honey were collected from different regions of Turkey. The regions from which the honey samples were collected are Marmara (twenty-three chestnut and seventeen floral), Central Anatolia (two chestnut and one floral), Eastern Anatolia (seven floral), Black Sea (four chestnut and one floral), and Mediterranean (one chestnut). Each honey sample was analyzed twice to furnish replicates. Honey samples contain phenolic compounds, which are the basis for differentiating chestnut and floral honey. Luteolin is only observed in floral honey, while ellagic acid, genkwanin, and trans-ferulic acid are only present in chestnut honey samples. The procedure for extraction
of the phenolic compounds is given in detail in the reference (Güneş et al., 2016). The extracted phenolic compounds were then analyzed by HPLC-DAD.

3.2 Chemicals
Hydrochloric acid (analytical grade), methanol (HPLC grade), and formic acid (HPLC grade) were purchased from Merck (Darmstadt, Germany). To confirm the phenolic compounds in the honey samples, the following standards were purchased from Sigma (St. Louis, MO, USA): chrysin, p-hydroxybenzoic acid, gallic acid, protocatechuic acid, vanillic acid, syringic acid, trans-3-hydroxycinnamic acid, hesperetin, 2-hydroxycinnamic acid, salicylic acid, pinocembrin, caffeic acid, p-coumaric acid, trans-ferulic acid, ferulic acid, caffeic acid phenylethyl ester, trans-chalcone, vitexin, hyperoside, rutin, ellagic acid, myricetin, quercitrin, quercetin, luteolin, kaempferol, apigenin, isorhamnetin, galangin, and genkwanin. Methanol was used for the preparation of all standard solutions. The chromatograms of these standard phenolic compounds were reported in the reference (Güneş et al., 2016).

3.3 Data Collections
Once the phenolic compounds were extracted from the honey samples, they were stored in the dark at 4 °C before HPLC-UV analysis. Phenolic compounds from the phenolic fractions were determined by HPLC with a UV detector on an Agilent 1200 HPLC system (Waldbronn, Germany). An XBridge C18 (4.6 ×
250 mm, 3.5 μm) column from Waters (Milford, MA, USA) was used for chromatographic separations. Experimental and gradient conditions are given in the reference (Güneş et al., 2016). Compounds from each phenolic fraction were measured twice by HPLC-DAD to furnish two replicates for each honey sample. An example of the two-way chromatographic and spectral image of a floral honey is given in Figure 2. A total of 112 two-way chromatographic and UV spectral images were then used for classification with chemometric methods.
4 Experimental Section
In the differentiation and classification of samples from complex data collections, chemometric methods can be powerful tools for obtaining chemical information from the two-way images. The conventional approach of using a single wavelength or the total absorbance discards useful information that is available from DAD detectors. FuRES, sPLS-DA, and the SVM classification tree were chosen because they are powerful classifiers.

Two objectives were considered. The first objective was to differentiate premium chestnut honey from lower cost floral honey, regardless of the source region of the honey. The second objective was to distinguish the Marmara geographical region from the other regions. Unfortunately, the other regions did not have enough samples for each one to be recognized individually, so they were grouped together. This regional classification was based on three sub-objectives: using only floral honey, using only chestnut honey, and using the combined floral and chestnut honey samples.

A total of 112 two-way chromatographic and spectral images were collected. One hundred bootstraps were used for resampling purposes. All evaluations for classification of the two-way chromatographic-spectral images of honey were performed using MATLAB R2016a (The MathWorks Inc., Natick, MA) (https://www.mathworks.com/products/new_products/release2016a.html). The computer operated under Microsoft Windows 7 Enterprise Edition 64-bit
(Redmond, WA) with an Intel Core™ i7 940 CPU Extreme Edition operating at 2.93 GHz and 8.00 GB RAM.

5 Discussion of Results
The aim of this research is honey profiling based on the phenolic compounds present in floral and chestnut honey from different geographical regions in Turkey. Data were collected by HPLC-DAD. FuRES, SVMTreeG, and sPLS-DA were used to classify floral and chestnut honey from the two-way chromatographic-spectral images after the preprocessing methods were applied.

The difference between normalized and aligned data can be visualized with the principal component analysis (PCA) given in the Supplementary Materials (Figure S2). Before alignment, the first two principal components span 63% of the variance and the scores of the chestnut honey have a wider spread. After alignment, the variance spanned by the first two principal components increased to 74%, and the scores of the chestnut honey cluster closer together. Although perfect separation is not achieved, improvement is observed after alignment.

Classification trees from FuRES and SVMTreeG for the chestnut and floral honey after normalization and after alignment of the data are given in the Supplementary Materials (Figure S3). In the classification trees, H represents the classification entropy, which increases after alignment for the FuRES trees because the fuzzy entropy measures the distance between classes and by
aligning the chromatograms some of the artificial differences between chromatograms of different classes were removed. This correction resulted in an increase of the fuzzy entropy. SVMTreeG uses a crisp or non-fuzzy entropy, so as long as the rule separates the objects of the class, the resulting entropy will be zero. The rules are numbered and Nc refers to the number of objects at each leaf of the tree.

Classification rates of chestnut and floral honey evaluated by 4 Latin partitions and 100 bootstraps (95% confidence intervals) are given in Table 1. This table presents the percentage classification rates for the normalized, aligned, and square root transformed data and for the dissimilarity kernel. Note that the dissimilarity kernel is applied after normalization, alignment, and the square root transform. A significant improvement was observed for the sPLS-DA method when the aligned data were square root transformed, and the classification rate increased from 89.9±0.5% to 96.1±0.4%. However, the dissimilarity kernel was 1.5% worse for the sPLS-DA method. The highest classification rates were obtained with FuRES and SVMTreeG, as observed in Table 1. Both the normalized and the aligned data of chestnut and floral honey gave similar classification rates with FuRES and SVMTreeG, ranging between 97.0±0.4% and 97.6±0.4%. The FuRES classification for the aligned data improved to 97.6±0.4%. However, the dissimilarity kernel slightly reduces the classification rate for the FuRES classifier because it is softer (i.e., from the fuzzy property) and
therefore more sensitive to preprocessing. A similar trend is observed with the SVMTreeG classifier: the classification rate increased slightly after the data were transformed, but a decrement of 1.2% was observed with the dissimilarity kernel for the SVMTreeG classification of chestnut and floral honey. These classification rates from FuRES and SVMTreeG are promising in that chestnut and floral honey can be successfully classified based on their phenolic chemical profiles.

Classifying the type of honey based on the phenolic composition of the floral and chestnut honey was the first objective of this research. The following discussion addresses the regional classification of floral and chestnut honey. The honey samples were grouped into two classes, one for the Marmara region and the second for the other regions.
For the floral honey samples, the other-region class comprised samples from Eastern Anatolia, Central Anatolia, and the Black Sea. For the chestnut honey samples, the other-region class comprised samples from the Black Sea, Central Anatolia, and the Mediterranean. The numbers of two-way objects in the other-region class for floral and chestnut honey are 18 and 14, respectively, while the numbers of two-way objects from the Marmara region for floral and chestnut honey are 34 and 46, respectively. When floral and chestnut honey are combined, the other-region class consisted of 32 two-way objects while the Marmara region consisted of 80 two-way objects.
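These object counts follow from the sample numbers reported in Section 3.1, with each sample measured twice. The short Python tally below reproduces them as a consistency check. The dictionary layout and the function name are illustrative, and the zero entries are assumptions for honey types that are not listed for a region in the text.

```python
REPLICATES = 2  # each honey sample was analyzed twice

# Samples per region as listed in Section 3.1; zeros are assumed where a
# honey type is not mentioned for a region.
samples = {
    "Marmara": {"chestnut": 23, "floral": 17},
    "Central Anatolia": {"chestnut": 2, "floral": 1},
    "Eastern Anatolia": {"chestnut": 0, "floral": 7},
    "Black Sea": {"chestnut": 4, "floral": 1},
    "Mediterranean": {"chestnut": 1, "floral": 0},
}

def count_objects(regions, honey_type):
    """Number of two-way objects of one honey type in the given regions."""
    return REPLICATES * sum(samples[region][honey_type] for region in regions)

others = [region for region in samples if region != "Marmara"]

marmara_floral = count_objects(["Marmara"], "floral")      # 34
marmara_chestnut = count_objects(["Marmara"], "chestnut")  # 46
other_floral = count_objects(others, "floral")             # 18
other_chestnut = count_objects(others, "chestnut")         # 14

print(marmara_floral + marmara_chestnut)  # 80 objects from Marmara
print(other_floral + other_chestnut)      # 32 objects from the other regions
```

The two classes sum to the 112 two-way images used throughout, and the class sizes are unbalanced (80 versus 32), which is one reason to preserve the class distribution when partitioning.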
The classification rates for the Marmara classification of floral honey with the normalized, aligned, square root transformed, and dissimilarity kernel data, obtained with 100 bootstraps and 4 Latin partitions, are given in Table 2, and the classification trees for the regions of floral honey are provided in the Supplementary Materials (Figure S4). For FuRES, the classification entropy increased by 0.06 when the normalized data were aligned, for the reason mentioned earlier. FuRES and SVMTreeG provided classification rates over 90% for the normalized, aligned, and square root transformed data, but lower than 90% for the dissimilarity kernel. The FuRES classification rates for regions of floral honey are 92.5±0.8% for the aligned data and 92.0±0.9% for the square root transformed data. SVMTreeG gave similar classification rates, with 92.5±0.8% for the aligned data and 90.9±0.9% for the square root transformed data. These classification rates demonstrate that regional classification of floral honey can be achieved with the FuRES and SVMTreeG classification methods. sPLS-DA did not perform as well.

The classification rates for regions of chestnut honey are provided in Table 3, while the corresponding classification trees of FuRES and SVMTreeG are given in the Supplementary Materials (Figure S5). The classification rates for chestnut honey in Table 3 are significantly improved for FuRES and SVMTreeG after the square root transform, with 91.9±0.9% and 91.6±0.9%, respectively. The sPLS-DA classification method performed poorly at 80% for all
the preprocessing methods. These improved classification rates show that the square root transformation benefited FuRES and SVMTreeG.

The combined regional classification rates for floral and chestnut honey are given in Table 4. The square root transform gave the best improvement of classification rate for all three classifiers. The highest classification rate was achieved with FuRES at 90.8±0.6% with the square root transform for the combined regional classification of floral and chestnut honey. An improvement of the classification rate by 1% for SVMTreeG was also achieved with the square root transform for the combined regional classification of floral and chestnut honeys. These classification rates are promising for the classification of honey samples by their phenolic chemical profiles.

6 Conclusions
Phenolic compounds that are present in honey can be used to classify different honey samples by HPLC-UV analysis and chemometrics. The high classification rates by FuRES and SVMTreeG indicate that chestnut and floral honey can be successfully classified based on their phenolic compositions. The regional classification rates from FuRES and SVMTreeG demonstrate that the phenolic profiles are useful for authentication. This analytical method, HPLC-DAD coupled with the FuRES and SVMTreeG classification of the two-way images, yielded a low-cost method for profiling honey that may be useful for authentication by flower
type or region. In the future, the honey phenolic profiles may be correlated with treatments of specific diseases or symptoms, so specific honeys may be targeted for patients.

7 Acknowledgements
The authors are thankful to Xinyi Wang, Zewei Chen, Jalicia Ruttino, Anne Marie Esposito, and Amanda McKeon for their helpful suggestions. Chemical Mapping, Inc. is thanked for their support.

Note
The authors declare no conflict of interest.
8 References
Al-Mamary, M., Al-Meeri, A., Al-Habori, M., (2002). Antioxidant activities and total phenolics of different types of honey. Nutrition Research 22(9), 1041-1047.
Aljadi, A., Kamaruddin, M., (2004). Evaluation of the phenolic contents and antioxidant capacities of two Malaysian floral honeys. Food Chemistry 85(4), 513-518.
Aloglu, A.K., Harrington, P.D., Sahin, S., Demir, C., (2016). Prediction of total antioxidant activity of Prunella L. species by automatic partial least square regression applied to 2-way liquid chromatographic UV spectral images. Talanta 161, 503-510.
Alvarez-Suarez, J., Giampieri, F., Damiani, E., Astolfi, P., Fattorini, D., Regoli, F., Quiles, J., Battino, M., (2012). Radical-scavenging Activity, Protective Effect Against Lipid Peroxidation and Mineral Contents of Monofloral Cuban Honeys. Plant Foods For Human Nutrition 67(1), 31-38.
Andrade, P., Ferreres, F., Amaral, M., (1997). Analysis of honey phenolic acids by HPLC, its application to honey botanical characterization. Journal of Liquid Chromatography & Related Technologies 20(14), 2281-2288.
Anklam, E., (1998). A review of the analytical methods to determine the geographical and botanical origin of honey. Food Chemistry 63(4), 549-562.
Anupama, D., Bhat, K., Sapna, V., (2003). Sensory and physico-chemical properties of commercial samples of honey. Food Research International 36(2), 183-191.
Barker, M., Rayens, W., (2003). Partial least squares for discrimination. Journal of Chemometrics 17(3), 166-173.
Bertelli, D., Plessi, M., Sabatini, A., Lolli, M., Grillenzoni, F., (2007). Classification of Italian honeys by mid-infrared diffuse reflectance spectroscopy (DRIFTS). Food Chemistry 101(4), 1565-1570.
Bogdanov, S., Jurendic, T., Sieber, R., Gallmann, P., (2008). Honey for Nutrition and Health: A Review. Journal of the American College of Nutrition 27(6), 677-689.
Boser, B.E., Guyon, I.M., Vapnik, V.N., (1992). A Training Algorithm for Optimal Margin Classifiers. In: COLT '92 Proceedings of the Fifth Annual Workshop on Computational Learning Theory. ACM, pp. 144-152.
Brereton, R., Lloyd, G., (2014). Partial least squares discriminant analysis: taking the magic away. Journal of Chemometrics 28(4), 213-225.
Can, Z., Yildiz, O., Sahin, H., Turumtay, E., Silici, S., Kolayli, S., (2015). An investigation of Turkish honeys: Their physico-chemical properties, antioxidant capacities and phenolic profiles. Food Chemistry 180, 133-141.
Chen, P., Luthria, D., Harrington, P., Harnly, J., (2011). Discrimination Among Panax Species Using Spectral Fingerprinting. Journal of AOAC International 94(5), 1411-1421.
Chen, Z., Zhang, X., Zhang, Z., (2016). Clinical risk assessment of patients with chronic kidney disease by using clinical data and multivariate models. International Urology and Nephrology 48, 2069-2075.
Ciulu, M., Solinas, S., Floris, I., Panzanelli, A., Pilo, M., Piu, P., Spano, N., Sanna, G., (2011). RP-HPLC determination of water-soluble vitamins in honey. Talanta 83(3), 924-929.
Cortes, C., Vapnik, V., (1995). Support-vector networks. Machine Learning 20(3), 273-297.
Danezis, G.P., Tsagkaris, A.S., Camin, F., Brusic, V., Georgiou, C.A., (2016). Food authentication: Techniques, trends & emerging approaches. TrAC Trends in Analytical Chemistry 85, Part A, 123-132.
Daniele, G., Maitre, D., Casabianca, H., (2012). Identification, quantification and carbon stable isotopes determinations of organic acids in monofloral honeys. A powerful tool for botanical and authenticity control. Rapid Communications in Mass Spectrometry 26(17), 1993-1998.
Eteraf-Oskouei, T., Najafi, M., (2013). Traditional and Modern Uses of Natural Honey in Human Diseases: A Review. Iranian Journal of Basic Medical Sciences 16(6), 731-742.
Grunkemeier, G.L., Wu, Y., (2004). Bootstrap resampling methods: something for nothing? The Annals of Thoracic Surgery 77(4), 1142-1144.
Page 27 of 35
Güneş, M.E., Şahin, S., Demir, C., Borum, E., Tosunoğlu, A., (2016). Determination of phenolic compounds profile in chestnut and floral honeys and their antioxidant and antimicrobial activities. Journal of Food Biochemistry, doi.org/10.1111/jfbc.12345. HARRINGTON, P., (1991). FUZZY MULTIVARIATE RULE-BUILDING EXPERT SYSTEMS - MINIMAL NEURAL NETWORKS. Journal of Chemometrics 5(5), 467-486. Harrington, P., (2006). Statistical validation of classification and calibration models using bootstrapped Latin partitions. Trac-Trends in Analytical Chemistry 25(11), 1112-1124. Harrington, P., (2015). Support Vector Machine Classification Trees. Analytical Chemistry 87(21), 11065-11071. Harrington, P., Kister, J., Artaud, J., Dupuy, N., (2009). Automated Principal Component-Based Orthogonal Signal Correction Applied to Fused Near Infrared-Mid-infrared Spectra of French Olive Oils. Analytical Chemistry 81(17), 7160-7169. Harrington, P., Vieira, N., Chen, P., Espinoza, J., Nien, J., Romero, R., Yergey, A., (2006). Proteomic analysis of amniotic fluids using analysis of variance-principal component analysis and fuzzy rule-building expert systems applied to matrix-assisted laser desorption/ionization mass spectrometry. Chemometrics and Intelligent Laboratory Systems 82(1-2), 283-293.
Page 28 of 35
Hoehse, M., Paul, A., Gornushkin, I., Panne, U., (2012). Multivariate classification of pigments and inks using combined Raman spectroscopy and LIBS. Analytical and Bioanalytical Chemistry 402(4), 1443-1450. Karoui, R., De Baerdemaeker, J., (2007). A review of the analytical methods coupled with chemometric tools for the determination of the quality and identity of dairy products. Food Chemistry 102(3), 621-640. Kivrak, I., (2015). Free Amino Acid Profiles of 17 Turkish Unifloral Honeys. Journal of Liquid Chromatography & Related Technologies 38(8), 855-862. Kucuk, M., Kolayli, S., Karaoglu, S., Ulusoy, E., Baltaci, C., Candan, F., (2007). Biological activities and chemical composition of three honeys of different types from Anatolia. Food Chemistry 100(2), 526-534. Marshall, S., Schneider, K., Cisneros, K., Gu, L., (2014). Determination of Antioxidant Capacities, alpha-Dicarbonyls, and Phenolic Phytochemicals in Florida Varietal Honeys Using HPLC-DAD-ESI-MSn. Journal of Agricultural and Food Chemistry 62(34), 8623-8631. Mateo, R., Bosch-Reig, F., (1998). Classification of Spanish unifloral honeys by discriminant analysis of electrical conductivity, color, water content, sugars, and pH. Journal of Agricultural and Food Chemistry 46(2), 393-400. Molan, P., (1999). Why honey is effective as a medicine. I. Its use in modern medicine. Bee World 80(2), 80-92. Perna, A., Intaglietta, I., Simonetti, A., Gambacorta, E., (2013). A comparative study on phenolic profile, vitamin C content and antioxidant
Page 29 of 35
activity of Italian honeys of different botanical origin. International Journal of Food Science and Technology 48(9), 1899-1908. Rearden, P., Harrington, P., Karnes, J., Bunker, C., (2007). Fuzzy rulebuilding expert system classification of fuel using solid-phase microextraction two-way gas chromatography differential mobility spectrometric data. Analytical Chemistry 79(4), 1485-1491. Sun, X., Chen, P., Cook, S., Jackson, G., Harnly, J., Harrington, P., (2012). Classification of Cultivation Locations of Panax quinquefolius L Samples using High Performance Liquid Chromatography - Electrospray Ionization Mass Spectrometry and Chemometric Analysis. Analytical Chemistry 84(8), 36283634. Vanhanen, L., Emmertz, A., Savage, G., (2011). Mineral analysis of monofloral New Zealand honey. Food Chemistry 128(1), 236-240. Wang, Z., Chen, P., Yu, L., Harrington, P., (2013). Authentication of Organically and Conventionally Grown Basils by Gas Chromatography/Mass Spectrometry Chemical Profiles. Analytical Chemistry 85(5), 2945-2953. Zerzucha, P., Daszykowski, M., Walczak, B., (2012). Dissimilarity partial least squares applied to non-linear modeling problems. Chemometrics and Intelligent Laboratory Systems 110(1), 156-162. Zerzucha, P., Walczak, B., (2012). Concept of (dis)similarity in data analysis. TrAC Trends in Analytical Chemistry 38, 116-128.
Page 30 of 35
Zhang, J., Zhang, Z., Xiang, Y., Dai, Y., Harrington, P., (2011). An emphatic orthogonal signal correction-support vector machine method for the classification of tissue sections of endometrial carcinoma by near infrared spectroscopy. Talanta 83(5), 1401-1409. Zhang, M., Harrington, P.d.B., (2013). Automated pipeline for classifying Aroclors in soil by gas chromatography/mass spectrometry using modulo compressed two-way data objects. Talanta 117, 483-491.
Page 31 of 35
Figure Captions

Figure 1. The total absorbance chromatograms of chestnut and floral honey from HPLC-UV detection before (top) and after (bottom) retention time alignment.

Figure 2. The two-way representation of the dataset of floral honey from the Marmara region, based on chromatographic and UV spectral measurements after retention time alignment.
Tables

Table 1. Classification rates of chestnut and floral honey with 4 partitions and 100 bootstraps (95% confidence intervals)

Classifier    Normalized    Normalized Aligned    Normalized/Aligned Square Root Transform    Dissimilarity Kernel
FuRES^a       97.1±0.4%     97.0±0.4%             97.6±0.4%                                   96.4±0.4%
sPLS-DA^b     94.0±0.5%     89.9±0.5%             96.1±0.4%                                   88.5±0.4%
SVMTreeG^c    97.6±0.4%     97.4±0.4%             97.6±0.4%                                   96.2±0.4%

^a FuRES, fuzzy rule-building expert system; ^b sPLS-DA, super partial least squares discriminant analysis; ^c SVMTreeG, support vector machine classification tree.
Table 2. Classification rates based on regions of floral honey with 4 partitions and 100 bootstraps (95% confidence intervals)

Classifier    Normalized    Normalized Aligned    Normalized/Aligned Square Root Transform    Dissimilarity Kernel
FuRES^a       93.8±0.7%     92.5±0.8%             92.0±0.9%                                   89±1%
sPLS-DA^b     91±1%         85±1%                 83±1%                                       73.3±1.1%
SVMTreeG^c    94.0±0.7%     92.5±0.8%             90.9±0.9%                                   88.6±0.9%

^a FuRES, fuzzy rule-building expert system; ^b sPLS-DA, super partial least squares discriminant analysis; ^c SVMTreeG, support vector machine classification tree.
Table 3. Classification rates based on regions of chestnut honey with 4 partitions and 100 bootstraps (95% confidence intervals)

Classifier    Normalized    Normalized Aligned    Normalized/Aligned Square Root Transform    Dissimilarity Kernel
FuRES^a       90.4±1.1%     88.3±1.0%             91.9±0.9%                                   88.1±0.9%
sPLS-DA^b     80.7±0.6%     81.0±0.5%             80.8±0.6%                                   79.0±0.5%
SVMTreeG^c    89.2±1.0%     88.2±0.9%             91.6±0.9%                                   87.3±0.9%

^a FuRES, fuzzy rule-building expert system; ^b sPLS-DA, super partial least squares discriminant analysis; ^c SVMTreeG, support vector machine classification tree.
Table 4. Classification rates based on combined regions of floral and chestnut honey with 4 partitions and 100 bootstraps (95% confidence intervals)

Classifier    Normalized    Normalized Aligned    Normalized/Aligned Square Root Transform    Dissimilarity Kernel
FuRES^a       89.7±0.8%     88.7±0.7%             90.8±0.6%                                   88.1±0.6%
sPLS-DA^b     74.2±0.4%     76.9±0.5%             77.4±0.6%                                   74.2±0.5%
SVMTreeG^c    89.3±0.8%     88.8±0.6%             89.8±0.6%                                   87.3±0.6%

^a FuRES, fuzzy rule-building expert system; ^b sPLS-DA, super partial least squares discriminant analysis; ^c SVMTreeG, support vector machine classification tree.
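
The table captions report classification rates obtained with 4 partitions and 100 bootstraps, together with 95% confidence intervals. As a rough illustration of how a figure of that form could be produced, the Python sketch below repeats a class-stratified 4-way partitioning 100 times, pools the held-out predictions within each bootstrap, and summarizes the 100 rates as a mean with a half-width. Everything specific in it is an assumption made for illustration and is not taken from the manuscript: the synthetic two-class data, the nearest-class-mean stand-in classifier (it is not FuRES, sPLS-DA, or SVMTreeG), the function names, and the choice of a t-based interval across bootstraps.

```python
import random
import statistics


def latin_partitions(labels, n_parts, rng):
    """Split object indices into n_parts folds with similar class proportions."""
    folds = [[] for _ in range(n_parts)]
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    offset = 0
    for lab in sorted(by_class):
        members = by_class[lab]
        rng.shuffle(members)
        # Deal the shuffled members of each class round-robin across the folds.
        for k, idx in enumerate(members):
            folds[(k + offset) % n_parts].append(idx)
        offset += len(members)
    return folds


def nearest_mean_predict(train_x, train_y, test_x):
    """Hypothetical stand-in classifier: assign each object to the nearest class mean."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    means = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(means, key=lambda y: dist2(x, means[y])) for x in test_x]


def bootstrapped_latin_partitions(x, y, n_parts=4, n_boot=100, seed=0):
    """Return (mean rate in %, 95% half-width) over n_boot repeated partitionings."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        correct = 0
        # Each fold is held out exactly once, so every object is predicted once.
        for fold in latin_partitions(y, n_parts, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            pred = nearest_mean_predict(
                [x[i] for i in train], [y[i] for i in train], [x[i] for i in fold]
            )
            correct += sum(p == y[i] for p, i in zip(pred, fold))
        rates.append(100.0 * correct / len(y))
    mean_rate = statistics.mean(rates)
    # Assumption: t-based interval; 1.984 is the two-sided 95% t value for 99
    # degrees of freedom, so it is only appropriate when n_boot is 100.
    half_width = 1.984 * statistics.stdev(rates) / len(rates) ** 0.5
    return mean_rate, half_width


# Synthetic two-class data, purely for demonstration.
data_rng = random.Random(1)
features = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(40)]
features += [[data_rng.gauss(2.0, 1.0), data_rng.gauss(2.0, 1.0)] for _ in range(40)]
classes = ["chestnut"] * 40 + ["floral"] * 40

mean_rate, half_width = bootstrapped_latin_partitions(features, classes)
print(f"{mean_rate:.1f} ± {half_width:.1f}%")
```

Pooling the four held-out folds within a bootstrap gives one rate per bootstrap, so the interval reflects the variation caused by how the objects happen to be partitioned. Substituting a different classifier only requires replacing `nearest_mean_predict` with a function of the same shape.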