Food Chemistry 245 (2018) 1052–1061
Mid infrared spectroscopy and chemometrics as tools for the classification of roasted coffees by cup quality

Ana Paula Craig (a), Bruno G. Botelho (b), Leandro S. Oliveira (c), Adriana S. Franca (c,⁎)

(a) PPCCA/Universidade Federal de Minas Gerais, Av. Antônio Carlos, 6627, 31270-901 Belo Horizonte, MG, Brazil
(b) DQ/Universidade Federal de Minas Gerais, Av. Antônio Carlos, 6627, 31270-901 Belo Horizonte, MG, Brazil
(c) DEMEC/Universidade Federal de Minas Gerais, Av. Antônio Carlos, 6627, 31270-901 Belo Horizonte, MG, Brazil
ARTICLE INFO
Keywords: Coffee; Cup quality; Infrared spectroscopy

ABSTRACT
Sensory (cup) analysis is a reliable methodology for green coffee quality evaluation, but faces barriers when applied to commercial roasted coffees due to lack of information on roasting conditions. The aim of this study was to examine the potential of mid-infrared spectroscopy for predicting cup quality of arabica coffees of different roasting degrees. PCA showed separation of arabica and robusta. A two-level hierarchical PLS-DA strategy was employed, with coffee being classified as high or low quality in the first level and then separated according to cup quality in the second level. Validation results showed that the second level models exhibited 100% sensitivity and specificity in the training sets. For the test set, sensitivity ranged from 67% (rio zona) to 100% (soft) while specificity ranged from 71% (rio) to 100% (rioysh, hard). Thus, the proposed method can be used for the quality evaluation of arabica coffees regardless of roasting conditions.
⁎ Corresponding author.
E-mail addresses: [email protected] (L.S. Oliveira), [email protected] (A.S. Franca).
https://doi.org/10.1016/j.foodchem.2017.11.066
Received 4 May 2017; Received in revised form 26 September 2017; Accepted 16 November 2017; Available online 20 November 2017
0308-8146/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

The term ‘quality’ is officially defined by the International Organization for Standardization (ISO) as “the extent to which a group of intrinsic features satisfies the requirements, where requirement means need or expectation, which may be explicit, generally implicit, or binding” (ISO, 2000). Product quality can assume different meanings for consumers, producers, and regulating organizations. In terms of coffee, quality may result from a large number of determinant factors such as the production system, the aspect and chemical composition of the green or roasted beans, the roasting process, and the cup preparation (Esteban-Díez, González-Sáiz, & Pizarro, 2004; Ribeiro, Ferreira, & Salva, 2011). Green coffee beans contain a wide range of different chemical compounds that react and interact amongst themselves at all stages of coffee roasting, resulting in even more diverse final products (Ribeiro et al., 2011).

Cup tasting is the most important tool for the quality assessment of green coffee. Sensory characteristics vary according to the producing country and must be considered as specific to each commercial origin (Bee et al., 2005). In Brazil, arabica coffee (Coffea arabica L.) is classified according to cup quality into up to seven categories, namely strictly soft, soft, barely soft, hard, rioysh, rio and rio zona (see Supplementary Material, Table S1). Robusta coffee (Coffea canephora), recognized by its appearance and woody and earthy flavor, is classified by cup quality into four categories, i.e., excellent, good, regular and abnormal, based on the Brazilian legislation (Brasil, 2003). The sensory parameters of the green coffee are ideally evaluated when roasting and grinding are conducted under controlled conditions, allowing the cupper to clearly perceive the flavors and fragrances of the coffee.

In the case of commercial roasted coffee, cup tasting is also the most widespread technique employed to evaluate the final quality of the product. However, the methodology faces barriers since small variations in the roasting process affect coffee in different ways, making its analysis a challenge. Sensory-based methods are time-consuming and require trained cuppers, hampering their efficient implementation for routine analysis and inspection purposes. As a consequence, regulatory organizations and industry strive to correlate sensory and experimental data in order to calibrate instrumental screening methodologies (Feria-Morales, 2002). Recent studies have demonstrated correlations between cup quality and chemical parameters including volatiles (Piccino, Boulanger, Descroix, & Sing, 2014), diterpenes (Novaes, Oigman, Alves de Souza, Rezende, & Aquino Neto, 2015) and chlorogenic acids (Zanin, Corso, Kitzberger, Scholz, & Benassi, 2016). Nonetheless, fast and inexpensive analytical methods that allow a reliable quality assessment of coffees for routine analysis are still needed.

Good examples of rapid and non-destructive fingerprinting techniques are mid- and near-infrared spectroscopy, MIR and NIR (Craig, Franca, & Irudayaraj, 2014a; Lohumi, Lee, Lee, & Cho, 2015). Recent research has demonstrated the potential of both MIR and NIR in roasted coffee quality analysis (Barbin, Felicio, Sun, Nixdorf, & Hirooka, 2014), with applications in the discrimination between arabica and robusta species (Bertone, Venturello, Giraudo, Pellegrino, & Geobaldo, 2016), detection of low quality defective coffee beans (Craig, Franca, & Oliveira, 2012; Craig, Franca, Oliveira, Irudayaraj, & Ileleji, 2014b; Craig, Franca, Oliveira, Irudayaraj, & Ileleji, 2015) and adulteration (Reis, Botelho, Franca, & Oliveira, 2017; Reis, Franca, & Oliveira, 2013, 2016). Esteban-Díez et al. (2004), Ribeiro et al. (2011), and Belchior, Franca, and Oliveira (2016) have demonstrated the potential of NIR and MIR for the prediction of specific sensory parameters such as acidity, aftertaste, body/mouthfeel, bitterness, intensity and overall coffee quality. However, no record has been found in the literature on the application of MIR spectroscopy for the discrimination of coffees according to cup quality regardless of their roasting degree.

In view of the aforementioned, the aim of this study was to examine the potential of mid-infrared spectroscopy and chemometrics for predicting cup quality (according to the Brazilian classification system) of coffees subjected to different roasting degrees. Samples were classified following the Brazilian legislation (Brasil, 2003) as soft, hard or hardish, rioysh, rio and rio zona. Since Brazilian commercial coffees may comprise blends of arabica and robusta species, the discrimination of arabica, pure robusta and blends of the two species was also investigated.

2. Materials and methods

2.1. Coffee samples and sensory analysis of green coffee

Twenty-eight Coffea arabica L. and two Coffea canephora green coffee samples were obtained from twelve different farms located in the states of Minas Gerais and Espírito Santo, Brazil. The amount collected from each farm varied from 250 g to 1 kg. Twenty-seven of the samples were from the 2013/14 coffee crop, while the other three samples were from older crops (2012/13, 2010/11). Arabica coffee samples were roasted to a light to medium degree and prepared and classified according to cup quality following the Brazilian legislation (Brasil, 2003). Arabica coffee samples were classified by five trained cuppers in the following ranking (from best to worst): soft, hard or hardish, rioysh, rio and rio zona. Taking into account that Brazilian commercial coffees may comprise blends of arabica and robusta species, robusta samples were included in the study. Although recent studies have shown that variations in origin and genotype can be a factor in discriminating arabica green coffees using spectroscopic techniques (Bona et al., 2017; Marquetti et al., 2016), we have assumed that such differences would not be significant in our study.

Twenty commercial coffees from different brands, classified according to cup quality as soft, hard, and rioysh, were obtained from Sindicafé (Belo Horizonte, Brazil). These samples were subjected to color evaluation as described in Section 2.2, and used as reference for the determination of the roasting degree range of the roasted coffees used in this study.

2.2. Roasting conditions

Samples of 250 g were taken from each sample lot and roasted in a Hottop (KN-8828P) drum roaster. The roasting temperature varied from 75 °C to at most 230 °C. All samples were ground in a coffee grinder (MCF 55, Arbel, Brazil), sieved (particle diameter < 0.15 mm) (see Table 1), and subjected to color evaluation. Color measurements were performed using a tristimulus colorimeter (HunterLab Colorflex 45/0 Spectrophotometer, Hunter Laboratories, VA, USA) with standard illumination D65 and colorimetric normal observer angle of 10°. Samples were roasted to dark, medium and light roasting degrees defined according to luminosity (L*) measurements, similar to commercially available coffee samples (18.0, dark < L* < 25.0, light). Roasting times were defined for each green coffee sample, given that physical and chemical attributes of low and high quality coffees contribute to a slow or fast roasting. The weight loss was calculated as the weight difference, in percentage, of each sample before and after roasting, as described by the following equation:

$$W_l = 100\left(\frac{w_i - w_f}{w_i}\right) \tag{1}$$

where $W_l$ is the weight loss, $w_i$ is the initial weight and $w_f$ is the final weight. From the initial 30 green coffee samples, 75 roasted samples were obtained. Eleven blends of arabica and robusta roasted coffees were produced, five of them with 30% robusta, and six of them with 50% robusta. The two blend proportions used in this study were established within the ranges used by Briandet, Kemsley, and Wilson (1996). Arabica samples used to produce the blends corresponded to hard, rioysh and rio coffees roasted to light, medium and dark degrees. The arabica and robusta blends were placed in Falcon tubes and shaken for one minute in a tube shaker (Fisatom, Brazil). The total of 86 samples obtained were stored at room temperature (19 °C) until analysis.

2.3. ATR-FTIR analysis

A Shimadzu IRAffinity-1 FTIR Spectrophotometer (Shimadzu, Japan) with a DLATGS (Deuterated Triglycine Sulphate Doped with L-Alanine) detector was used in the ATR-FTIR measurements, which were performed in a dry atmosphere at room temperature (20 ± 0.5 °C). A horizontal ATR sampling accessory (MIRacleA) equipped with a ZnSe cell was employed. Approximately 5 g of the ground and roasted coffee was placed in the sampling accessory and pressed. The empty sampling accessory was used to obtain the background spectrum. All spectra were recorded within a range of 4000–600 cm−1 with a 4 cm−1 resolution and 20 scans and submitted to background subtraction. All samples were analyzed in triplicate.

2.4. Data analysis

Preprocessing techniques were applied to the raw data prior to statistical analysis to compensate for any changes in experimental conditions and enhance the results. The preprocessing steps included: baseline correction, area and maximum value normalization, MSC (Multiplicative Scatter Correction), and 1st and 2nd derivatives with Savitzky-Golay smoothing. Prior to the statistical analysis all datasets were mean centered. The regions 4000–3050 cm−1, 2800–1800 cm−1 and 690–600 cm−1 were excluded in order to avoid noise effects. Principal Component Analysis (PCA) was applied to provide an explanation of the data variability and to identify possible outliers. Partial Least Squares Discriminant Analysis (PLS-DA) was used to develop classification models. Spectral data were randomly divided into calibration (75%) and validation (25%) sets. Cross-validation (contiguous blocks with 8 data splits) was applied to choose the optimal number of latent variables to be employed. The predictability of the resulting models was evaluated based on the classification error for the validation set. Sensitivity and specificity values were calculated for each coffee class according to the following equations (López, Pillar Callao, & Ruisánchez, 2015):
$$\text{Sensitivity} = \frac{N(\text{true positives})}{N(\text{true positives}) + N(\text{false negatives})} \tag{2}$$

$$\text{Specificity} = \frac{N(\text{true negatives})}{N(\text{true negatives}) + N(\text{false positives})} \tag{3}$$
Matlab (The MathWorks, Co., Natick, MA) and the PLS_Toolbox package (Eigenvector Research, Inc.) were employed for the statistical calculations.
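As an illustration of Eqs. (2) and (3), the following minimal Python sketch computes per-class sensitivity and specificity from a table of actual versus predicted counts. It is not the authors' Matlab/PLS_Toolbox workflow; the function name `sensitivity_specificity` and the dictionary layout are assumptions made for this example. The counts are the test-set values reported in Table 3 for the rioysh/rio/rio zona model.

```python
from typing import Dict, Tuple

# counts[actual][predicted] = number of samples (layout assumed for this sketch)
Counts = Dict[str, Dict[str, int]]


def sensitivity_specificity(counts: Counts) -> Dict[str, Tuple[float, float]]:
    """Return {class: (sensitivity, specificity)} following Eqs. (2) and (3)."""
    classes = list(counts)
    total = sum(sum(row.values()) for row in counts.values())
    results = {}
    for cls in classes:
        tp = counts[cls].get(cls, 0)                  # true positives
        fn = sum(counts[cls].values()) - tp           # false negatives
        fp = sum(counts[other].get(cls, 0)            # false positives
                 for other in classes if other != cls)
        tn = total - tp - fn - fp                     # true negatives
        results[cls] = (tp / (tp + fn), tn / (tn + fp))
    return results


# Test-set counts from Table 3 (second level, rioysh/rio/rio zona model).
table3_test: Counts = {
    "rioysh":   {"rioysh": 3, "rio": 1, "rio zona": 0},
    "rio":      {"rioysh": 0, "rio": 4, "rio zona": 1},
    "rio zona": {"rioysh": 0, "rio": 1, "rio zona": 2},
}

metrics = sensitivity_specificity(table3_test)
for cls, (sens, spec) in metrics.items():
    print(f"{cls}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

With these counts the sketch gives 0.750/1.000 for rioysh, 0.800/0.714 for rio and 0.667/0.889 for rio zona, which agrees with the values listed in Table 3.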
Table 1
Green coffee and obtained roasted coffee samples and their respective color measurements (L*) and weight loss (WL) (%).

Green coffee                              Roasted coffee, roast a
Sample  Cup quality  Cupper  Crop         L*      WL (%)
1       Soft         1       13/14        24.04   13.27
2       Soft         2       12/13        22.13   16.28
3       Soft         2       13/14        22.51   16.52
4       Soft         2       13/14        21.92   14.90
5       Soft         2       13/14        24.18   13.96
6       Soft         3       13/14        24.04   13.83
7       Soft         2       13/14        23.8    13.91
8       Soft         2       13/14        25.21   12.49
9       Soft         3       13/14        19.79   15.32
10      Hard         1       13/14        21.86   13.59
11      Hard         3       13/14        23.56   13.71
12      Hard         2       13/14        25.54   14.47
13      Hard         4       13/14        22.7    14.87
14      Hard         4       13/14        25.68   14.89
15      Hard         2       13/14        22.75   12.66
16      Rioysh       3       13/14        23.12   15.85
17      Rioysh       4       13/14        25.28   14.00
18      Rioysh       2       13/14        23.55   14.38
19      Rioysh       2       13/14        23.65   12.58
20      Rioysh       2       13/14        23.86   13.07
21      Rio          5       10/11        24.23   12.89
22      Rio          2       12/13        25.39   15.70
23      Rio          4       13/14        22.6    23.67
24      Rio          3       13/14        25.47   16.10
25      Rio zona     3       13/14        20.89   21.35
26      Rio zona     3       13/14        22.06   22.83
27      Rio zona     2       13/14        25.1    14.06
28      Rio zona     2       13/14        24.51   13.37
29      Robusta      3       13/14        23.91   17.61
30      Robusta      4       13/14        26.64   14.99

Roasted coffee, roasts b, c and d (blocks in the order extracted; the sample and roast assignment of each block is not recoverable from the source layout):
L*: 21.34 19.42 | WL (%): 14.24 18.94
L*: 18.54 | WL (%): 16.01
L*: 22.83 21.72 22.38 | WL (%): 14.79 15.04 14.12
L*: 18.77 21.08 22.07 | WL (%): 15.80 15.06 14.12
L*: 21.47 | WL (%): 17.23
L*: 20.37 22.43 23.92 21.66 21.76 21.18 20.28 22.87 23.06 22.03 22.2 22.81 25.06 | WL (%): 14.46 14.18 15.02 15.63 16.78 13.35 16.99 16.23 13.74 13.77 13.18 13.38 16.32
L*: 21.88 20.94 | WL (%): 14.09 16.08
L*: 19.21 | WL (%): 17.43
L*: 19.26 | WL (%): 14.38
L*: 18.97 | WL (%): 14.56
L*: 19.15 20.54 20.56 | WL (%): 17.99 14.76 14.73
L*: 19.79 22.52 | WL (%): 14.80 17.38
L*: 18.9 18.77 | WL (%): 15.28 19.42
L*: 19.52 18.87 | WL (%): 18.86 22.10
L*: 23.51 22.66 20.69 20.91 | WL (%): 14.27 14.61 19.35 18.27
L*: 22.2 21.56 | WL (%): 15.74 16.96
L*: 21.69 | WL (%): 14.72
L*: 19.48 | WL (%): 19.48

a, b, c, and d = roasted coffees with different roasting degrees obtained for each green coffee sample; for a given sample, a and d represent the lightest and darkest roasts, respectively.
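The WL (%) values in Table 1 follow Eq. (1). As a minimal sketch of that calculation, the Python function below (the name `weight_loss_percent` is chosen for this example) applies the formula to one batch. The 250 g initial weight matches the batch size stated in Section 2.2; the 215 g final weight is a hypothetical value used only to show the arithmetic, since individual batch weights are not reported.

```python
def weight_loss_percent(initial_weight: float, final_weight: float) -> float:
    """Weight loss W_l (%) from Eq. (1): 100 * (w_i - w_f) / w_i."""
    if initial_weight <= 0:
        raise ValueError("initial_weight must be positive")
    if final_weight < 0:
        raise ValueError("final_weight cannot be negative")
    return 100.0 * (initial_weight - final_weight) / initial_weight


# Hypothetical batch: 250 g of green coffee weighing 215 g after roasting.
example_loss = weight_loss_percent(250.0, 215.0)
print(f"Weight loss: {example_loss:.2f}%")
```

A result of 14% for this hypothetical batch falls inside the 13–24% range reported for the roasted samples.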
3. Results and discussion

3.1. Samples

The overall appearance of the green coffee beans used in this study can be viewed in the Supplementary material (Fig. S1). Soft and hard coffee beans exhibit characteristic arabica bean features, such as greenish or greyish color, and plano-convex shape, grooved on the flat side. No visual defects were observed in soft samples, while immature beans were present in hard coffee samples. Rioysh beans exhibited defects mostly associated with irregular shape and integrity, such as broken beans, bean fragments and insect damaged beans. Rio and rio zona beans presented a high percentage of defects associated with irregular appearance and shape, presence of fruit parts such as husks, and foreign matter such as sticks and small stones. Robusta beans exhibited their typical features, being brownish, smaller and rounder with a tighter center cut than those of arabica.

It is expected that the amount of defects will increase as cup quality decreases, and there are many possible variations on the number and type of defects that can result in the same cup quality, especially in the case of low quality coffees. Furthermore, the information on the amount of defects will not be available for commercially available roasted coffee samples. Given that the goal of our study was to verify whether or not samples could be separated exclusively by sensory analysis, a quantitative evaluation of the distribution of defects in our samples was not performed.

During roasting a series of chemical changes occurs. Water, including moisture originally present in the green bean and water generated by reactions, as well as carbon dioxide escape from the bean. The Maillard reaction generates melanoidins, organic volatiles, water and carbon dioxide. Mono- and disaccharides are consumed. Sucrose is hydrolysed and pyrolysed. Polysaccharides, except cellulose, are extensively altered in terms of degrees of branching and of polymerization, being thus partly solubilized. Protein is denatured, the amino acid composition is changed, and free amino acids react to form Maillard products. Chlorogenic acids and other organic acids are for the most part decomposed. Overall, there are only slight changes in the lipid fraction (Bonnlander, Eggers, Engelhardt, & Maier, 2005).

In general, soft, hard and robusta coffees were evenly roasted and released a highly pleasant smell. Low quality coffees (rioysh, rio and rio zona) roasted unevenly, and presented dark and light beans in the same roasting batch. The presence of defects in low quality coffees will contribute to an uneven roast. While immature beans become yellowish, black beans appear to be burnt (Mendonça, Franca, & Oliveira, 2015). Broken and shell beans become darker quickly due to their reduced volume in relation to normal beans. Husk fragments may be burnt during roasting, contributing to the depreciation of the beverage quality.

Luminosity (L*) and weight loss results are displayed in Table 1. Roasting times were adjusted in order to obtain coffees with light, medium and dark roasting degrees. These degrees were defined on the basis of L* measurements of the ground coffee, using commercial ground coffees as reference. Roasting times varied from 12 min 30 s to 20 min. Overall, in order to achieve a given pre-established roasting degree on the basis of luminosity values, samples lost approximately 13–24% of weight, as shown in Table 1. The differences observed in the percentage of weight loss can be explained by the initial moisture content of each sample, which varies based on the processing and storage conditions of the beans. Another variable that may influence the roasting time and weight loss of low quality coffees is the presence of defective (black, immature, sour) beans. Previous studies indicated that these beans may roast slower than normal ones, especially due to their lower levels of carbohydrates, which will be converted into Maillard and pyrolysis products (Craig et al., 2014b; Franca, Oliveira, Mendonça, & Silva, 2005a).

Fig. 1. Average ATR-FTIR spectra of coffee powder: (a) original (without pre-processing); (b) after MSC normalization; and (c) 2nd derivative. Traces: soft, hard, rioysh, rio, rio zona, robusta, arabica and robusta blend.

3.2. Qualitative evaluation of the spectra

Fig. 1 shows the average original and preprocessed spectra of roasted coffee. When MSC normalization was applied to account for scaling and baseline effects (Fig. 1b), the relatively higher absorbance intensity of arabica coffees at 2922, 2852, 1743, and 1153 cm−1 became evident. The first three bands are mostly associated with lipid absorption, and assigned respectively to the asymmetric stretching of CH2, symmetric stretching of CH2, and C=O stretching adjacent to the C–O– group in esters (Silverstein, Webster, Kiemle, & Bryce, 2015). The last peak, at 1153 cm−1, has been associated with polysaccharides. Particularly, Robert, Marquis, Barron, Guillon, and Saulnier (2005) attributed a peak at 1160 cm−1 to the asymmetric C–O–C stretching mode
of glycosidic link in cellulose. In the study by Kemsley, Ruault, and Wilson (1995), the most prominent differences in the mid-infrared spectra of roasted arabica and robusta coffees were observed at 1744 cm−1 and 1150 cm−1. At these bands arabica absorbed more intensely than robusta, corroborating our findings. Pure robusta and arabica and robusta blends exhibited relatively higher absorbance intensity at 1643 cm−1, where the C=O stretching vibration in caffeine takes place (Falk, Gil, & Iza, 1990). In fact, robusta coffee has almost twice the caffeine content of arabica (Ky et al., 2001). In the study by Esteban-Díez et al. (2004) the main differences between the NIR spectra of arabica and robusta coffees were also attributed to higher and lower absorbance intensities at regions characterized by lipids and caffeine absorption, respectively.

Robusta coffee also showed relatively higher absorbance intensity at the region 1130–950 cm−1, where carbohydrates strongly absorb. However, a precise chemical assignment in this region of the spectrum is challenging due to highly coupled vibration modes of polysaccharide backbones. According to the literature, galactans absorb at 1134 cm−1 due to C–O–C stretching vibration. Cellulose shows strong bands at 1059 cm−1 and 1033 cm−1 (ring vibration) (Kacurakova, Capek, Sasinkova, Wellner, & Ebringerova, 2000), and arabinogalactans may absorb at 1065 cm−1 and 1020 cm−1 (Robert et al., 2005) or 1078 cm−1 and 1043 cm−1 (C–C and ring vibrations) (Kacurakova et al., 2000). Furthermore, the arabinogalactan/mannan fraction of green coffee beans has been related to major bands at 1066 cm−1 (C–C) and 1034 cm−1 (ring vibration) (Kacurakova et al., 2000).

The differences in the total content and structure of polysaccharides in arabica and robusta species have been discussed in the literature. It is well known that the predominant polysaccharides in arabica and robusta coffees are galactomannans, arabinogalactans and cellulose. Although the composition of these polysaccharides is similar in both species, differences in the relative amounts of galactomannan and arabinogalactan have been reported. In the green beans, robusta has shown higher levels of arabinogalactans (Fischer, Reimann, Trovato, & Redgwell, 2001; Nunes & Coimbra, 2002). In contrast, Nunes and Coimbra (2002) found that the amounts of arabinogalactans in roasted arabica and robusta coffees were in the same range, but the amount of galactomannans in robusta was nearly half of that in arabica. Thaler (1979) studied the predominance of high-polymeric carbohydrates (molecular weight > 10,000) consisting only of mannans and galactans in roasted coffees. Whereas arabica coffee released a greater amount of high-polymeric carbohydrates, less than half of these polysaccharides consisted of mannans and galactans. The opposite was observed for robusta coffee. Overall, reliable conclusions are difficult to draw since the results reported generally have to be extrapolated and recalculated from a series of values obtained after various treatments and extractions that are not comparable (Fischer et al., 2001).

Fig. 2. PCA scores scatter plots (a, b, c) and loading plots (d) of 2nd derivative ATR-FTIR spectra (2960–2800/1800–1680/1249–1130 cm−1) of coffee powder: (a) PC1 vs. PC2; (b) scores on PC1; (c) scores on PC2; (d) loadings on PC1 and loadings on PC2.

Table 2
Classification results of PLS-DA models used in the classification of coffees according to cup quality.

Training set
                        Actual class
                        Soft       Hard       Rioysh     Rio        Rio Zona
                        (n = 13)   (n = 11)   (n = 9)    (n = 9)    (n = 7)
Predicted as Soft       13         2          0          0          0
Predicted as Hard       0          8          0          0          0
Predicted as Rioysh     0          1          9          0          0
Predicted as Rio        0          0          0          9          0
Predicted as Rio Zona   0          0          0          0          7
Sensitivity             1.000      0.727      1.000      1.000      1.000
Specificity             0.943      1.000      0.974      1.000      1.000

Test set
                        Actual class
                        Soft       Hard       Rioysh     Rio        Rio Zona
                        (n = 6)    (n = 6)    (n = 4)    (n = 5)    (n = 3)
Predicted as Soft       4          2          0          0          0
Predicted as Hard       1          3          0          0          0
Predicted as Rioysh     1          1          2          0          0
Predicted as Rio        0          0          1          4          1
Predicted as Rio Zona   0          0          1          1          2
Sensitivity             0.667      0.500      0.500      0.800      0.667
Specificity             0.944      0.944      0.900      0.842      0.905

The 2nd-order derivative calculation results in a spectral pattern display of absorption bands pointing down rather than up (Fig. 1c). The 2nd derivative can be very helpful in spectral interpretation because the peak location is maintained with respect to the log(1/R) spectral pattern, and the band resolution is enhanced. The major advantage is that, unlike the 1st derivative, the 2nd derivative generates few if any false bands in the negative direction. However, two false valleys in the positive ordinate scale may be generated for every band in the negative direction. In addition to the discriminating bands observed in the original and MSC spectra (Fig. 1a and b), Fig. 1c suggested that pure robusta and arabica and robusta blends absorbed slightly more intensely at 1058 and 1033 cm−1. From a visual examination, it was not possible to observe a natural clustering of arabica samples by cup quality, and statistical analysis is required to better explore their differences.
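The sign behaviour of the 2nd derivative described here can be reproduced on a synthetic band. The sketch below is illustrative only and is not the authors' PLS_Toolbox preprocessing: it uses the standard Savitzky-Golay weights for a 5-point window and a quadratic polynomial, whereas the window width and polynomial order used in the study are not stated, and the Gaussian band (centre and width) is a hypothetical example.

```python
import math

# Savitzky-Golay convolution weights for the 2nd derivative
# (5-point window, quadratic polynomial, unit spacing between points).
SG_D2_WINDOW5 = (2 / 7, -1 / 7, -2 / 7, -1 / 7, 2 / 7)


def second_derivative(signal):
    """Smoothed 2nd derivative; the two points at each edge are dropped."""
    half = len(SG_D2_WINDOW5) // 2
    out = []
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out.append(sum(w * y for w, y in zip(SG_D2_WINDOW5, window)))
    return out


# Hypothetical absorption band: Gaussian centred at index 50, sigma = 5 points.
band = [math.exp(-((i - 50) ** 2) / (2 * 5.0 ** 2)) for i in range(101)]
d2 = second_derivative(band)

centre = 50 - 2  # index shift caused by the two dropped edge points
print("2nd derivative at band centre:", round(d2[centre], 4))
print("largest positive lobe, left:  ", round(max(d2[:centre]), 4))
print("largest positive lobe, right: ", round(max(d2[centre + 1:]), 4))
```

The band centre maps to the most negative value of the derivative, so the peak position is preserved while the band points down, and a smaller positive lobe appears on each side of it, corresponding to the two false valleys in the positive ordinate scale mentioned above.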
Table 3
Classification results of hierarchical PLS-DA models used in the classification of coffees according to cup quality.

First Level Model – Low Quality/High Quality Differentiation

Training set
                            Actual class
                            Low Quality (n = 25)   High Quality (n = 24)
Predicted as Low Quality    25                     1
Predicted as High Quality   0                      23
Sensitivity                 1.000                  0.960
Specificity                 0.960                  1.000

Test set
                            Actual class
                            Low Quality (n = 12)   High Quality (n = 12)
Predicted as Low Quality    11                     1
Predicted as High Quality   1                      11
Sensitivity                 0.917                  0.917
Specificity                 0.917                  0.917

Second Level Model – Soft/Hard Differentiation

Training set
                            Actual class
                            Soft (n = 13)          Hard (n = 11)
Predicted as Soft           13                     0
Predicted as Hard           0                      11
Sensitivity                 1.000                  1.000
Specificity                 1.000                  1.000

Test set
                            Actual class
                            Soft (n = 6)           Hard (n = 6)
Predicted as Soft           6                      1
Predicted as Hard           0                      5
Sensitivity                 1.000                  0.833
Specificity                 0.833                  1.000

Second Level Model – Rioysh/Rio/Rio Zona Differentiation

Training set
                            Actual class
                            Rioysh (n = 9)   Rio (n = 9)   Rio Zona (n = 7)
Predicted as Rioysh         9                0             0
Predicted as Rio            0                9             0
Predicted as Rio Zona       0                0             7
Sensitivity                 1.000            1.000         1.000
Specificity                 1.000            1.000         1.000

Test set
                            Actual class
                            Rioysh (n = 4)   Rio (n = 5)   Rio Zona (n = 3)
Predicted as Rioysh         3                0             0
Predicted as Rio            1                4             1
Predicted as Rio Zona       0                1             2
Sensitivity                 0.750            0.800         0.667
Specificity                 1.000            0.714         0.889

3.3. Exploratory analysis

PCA was applied to the spectral data subjected to different mathematical preprocessing methods (see Fig. 2). The full spectral range as well as combinations of specific ranges were used. The best sample clustering was obtained with the application of the 2nd derivative to the ranges of 2960–2800, 1800–1680 and 1249–1130 cm−1 (regions highlighted in Fig. 2c). Seven out of the original 86 samples with Q or T2 statistics values over the threshold of 95% were classified as outliers and removed from the model, which was then rebuilt. Four out of the five robusta samples were outside the PCA limits at the 95% confidence level. Being outside the PCA limits does not mean the sample was not modeled in the new PCA space, but that the sample was more unusual (Wise & Gallagher, 2013). This was somewhat expected, since the chemical difference between species is significantly greater than the chemical difference between arabica samples of different cup qualities. These samples are not outliers, but observations that varied the most from the majority of the sample population. A preliminary examination of Fig. 2a suggests that samples were primarily separated based on species along the PC1 axis, and then separated by cup quality along the PC2 axis. Separation between pure arabica and the remaining samples could be observed along the PC1 axis, which represented 98.26% of the variance amongst samples (see
Fig. 2b). The loadings plot of PC1 (Fig. 2d) indicates that the greatest contributors to the separation were prominent bands related to lipid absorption at 2922 cm−1 (asym CH2 str), 2840 cm−1 (sym CH2 str), and 1745 cm−1 (C=O stretching adjacent to the C–O– group in esters) (Silverstein et al., 2015). At these bands, pure arabica samples, regardless of cup quality, absorbed more intensely than robusta and arabica and robusta blends. This result is in agreement with the fact that arabica coffee has twice the lipid content of robusta (Alves, Casal, Alves, & Oliveira, 2009). Other investigations also revealed differences in the lipid fraction of arabica and robusta coffees regarding their unsaponifiable fraction (Pacetti, Boselli, Balzano, & Frega, 2012) and fatty acid composition (Romano et al., 2014).

A trend in the discrimination of (a) soft and part of the hard samples (positive scores) and (b) rio and rio zona coffee (negative scores) was observed in the scatter plot of the PCA scores on PC2 (Fig. 2c). Rioysh, pure robusta, and arabica and robusta blends were scattered along the PC2 axis. Regions around 2922 cm−1, 2840 cm−1 and 1745 cm−1 exhibited large loading scores (> 0.1 and < −0.1) (Fig. 2d). The high loading scores observed at 1728 cm−1 (C=O str in aliphatic esters, aldehydes and ketones) resulted from the false valley generated in the positive ordinate scale for the negative peak at 1745 cm−1 of the 2nd derivative spectra, as shown in Fig. 2c.

Fig. 3. VIP Scores for (a) First Level Model – Discrimination between low quality/high quality coffees; (b) Second Level Model – Discrimination between soft and hard coffees; (c) Second Level Model – Discrimination between Rioysh, Rio and Rio Zona Coffee. Dashed line corresponds to the variable significance threshold.
classes presenting sensitivity results worse than 0.700. It is interesting to notice that none of the lower quality coffees (rio, rioysh and rio zona) were misclassified as a higher quality coffee (soft or hard), and none of the higher quality coffees were classified as rio or rio zona. These findings clearly indicate that high quality coffees are different from low quality coffees. However, due to small differences encountered among classes, satisfactory classifications of the test sets were not obtained.

To overcome this class similarity, an alternative approach called Hierarchical Models (HM) was tested. When HM is used, instead of building one multi-class model containing all studied classes, a sequence of simpler models is built. This reduces model complexity and improves model interpretability, because relevant features for discrimination of classes are presented gradually. This approach has already been successfully applied for the simultaneous detection of multiple adulterants in coffee (Reis et al., 2017). Based on the HM strategy, a two-level PLS-DA Hierarchical Model was built. In the first level, coffee samples were classified as high quality (HQ) or low quality (LQ) coffees. In the second level, HQ coffees were classified as soft or hard and LQ coffees were classified as rioysh, rio or rio zona (see Supplementary Material, Fig. S2).

The first HM level comprised one model, built using the same 73 coffee samples described previously. These samples were grouped as HQ (36 samples) and LQ (37 samples). About 70% of the samples of each group were randomly selected to compose the training set (24 HQ samples and 25 LQ samples), and the remaining samples were used as a test set (24 samples, 12 from each class). The data were preprocessed using MSC and the best LV number was also selected using CVCE. Seven LV were selected, accounting for 94.67% of X variance and for 90.61% of Y variance.
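The routing logic of such a two-level hierarchical model can be sketched as follows. This is a schematic illustration, not the PLS-DA models built in the study: the function `classify_hierarchical` and the three threshold-based stand-in classifiers are hypothetical placeholders, used only to show how a first-level HQ/LQ decision selects which second-level model assigns the final cup quality class.

```python
from typing import Callable, Sequence

Spectrum = Sequence[float]
Classifier = Callable[[Spectrum], str]


def classify_hierarchical(spectrum: Spectrum,
                          first_level: Classifier,
                          hq_model: Classifier,
                          lq_model: Classifier) -> str:
    """Route a sample through the first level, then the matching sub-model."""
    group = first_level(spectrum)
    if group == "HQ":
        return hq_model(spectrum)   # soft or hard
    if group == "LQ":
        return lq_model(spectrum)   # rioysh, rio or rio zona
    raise ValueError(f"unexpected first-level label: {group!r}")


# Hypothetical stand-ins for trained models (simple thresholds, not PLS-DA).
def toy_first_level(s: Spectrum) -> str:
    return "HQ" if sum(s) / len(s) >= 0.5 else "LQ"


def toy_hq_model(s: Spectrum) -> str:
    return "soft" if max(s) >= 0.8 else "hard"


def toy_lq_model(s: Spectrum) -> str:
    if min(s) >= 0.3:
        return "rioysh"
    return "rio" if min(s) >= 0.1 else "rio zona"


samples = [[0.9, 0.7], [0.6, 0.5], [0.4, 0.35], [0.2, 0.15], [0.05, 0.2]]
labels = [classify_hierarchical(s, toy_first_level, toy_hq_model, toy_lq_model)
          for s in samples]
print(labels)
```

In this arrangement each sub-model only has to separate the classes within its own branch, which is the reduction in model complexity that motivates the hierarchical strategy.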
Excellent classification was obtained in the first level, with only one HQ misclassified sample in the training set and one of each class in the test set (Table 3). The Variable Importance Projection (VIP) Scores from the first level model are presented in Fig. 3. According to the VIP Scores, the most important spectral regions for the differentiation between HQ and LQ coffees were 2918, 1749, 1038 and 806 cm−1. The bands at 2918 and 1749 cm−1 are related to lipid absorption in coffees and the absorption in 1038 and 806 cm−1 may be related to carbohydrate contents. The Second level of the HM was composed of two sub models, one for differentiating HQ coffees into soft and hard coffes, and another model to differentiate LQ coffees into rioysh, rio and rio zona. The soft/ hard model was built using 36 samples (19 soft samples and 17 hard samples). These samples were randomly divided into training and test sets. MSC and mean centering were used as preprocessing and, based on the smallest CVCE, 7 LV were chosen, accounting for 88.35% of X variance and 84.09% of Y variance. As can be seen in Table 3, no sample was misclassified in the training set and only one hard coffee sample was misclassified as soft samples in the test. According to the VIP Scores, the lipid absorption bands (2918 and 1749 cm−1) were also relevant in the differentiation between HQ coffees. Caffeine, which can be associated with the absorption in the 1625 cm−1, was also a relevant feature for the discrimination, along with several bands related to differents carbohydrates absorptions, that were not relevant in the first level model (1150, 925, 815 and 760 cm−1), and can be considered specific to discriminate soft and hard coffees, because they were not relevant in the LQ/HQ coffee discrimination (Fig. 3b). The rioysh/rio/rio zona model was built using 37 samples (13 rioysh samples, 14 rio samples and 10 rio zona samples) that were also randomly divided into training and test sets. 
Data for this model were preprocessed with first derivative followed by Savitzky-Golay smoothing and mean centering, and 4 LV were chosen based on the smallest CVCE, explaining 96.23% of X variance and 81.62% of Y variance. No misclassification occurred in the training set, and only one sample of each class was misclassified in the test set (one rioysh and one rio zona misclassified as rio and one rio misclassified as rio zona). The VIP Scores for the rioysh/rio/rio zona model are presented in Fig. 3c. It is possible to notice that there is no specific band or spectral region that
Based on the PCA and loading scores on PC2 (Fig. 2c and d), it is inferred that soft samples exhibited positive scores on PC2 because of their slightly higher absorbance intensity at 1745 cm−1 (C=O str in esters), and rio and rio zona exhibited negative scores on PC2 because of their slightly higher absorbance intensity at 2922 cm−1 and 2840 cm−1 (sym and asym CH2 str). Most hard samples were scattered among the soft coffee samples, but some exhibited negative scores on PC2. Our findings suggest that there are chemical differences in the composition of the lipid fraction of high and low quality coffees. In fact, a higher oxidation of triglycerides in rioysh, rio and rio zona coffees, with the formation of free fatty acids and volatile compounds, could impart an “off-flavor” to the beverage. To the best of our knowledge, a comparative study on the fatty acid profile of high and low cup quality coffees has not been reported. According to Speer and Kölling-Speer (2006), the content of free fatty acids in green coffee can increase significantly (from 1 g·kg−1 to 7 g·kg−1, in one year) in coffees stored at high temperatures and high humidity, which could also contribute to flavor depreciation. As far as total fat content is concerned, Franca, Mendonça, and Oliveira (2005b) found that soft green coffee beans exhibited higher levels in comparison to hard, rioysh and rio. After roasting, no significant difference was observed. Comparative evaluations of the total lipid content in normal and defective coffee beans, which are directly associated with the beverage cup quality, suggested that defective beans contain slightly less lipid than normal beans (Craig et al., 2014a; Franca et al., 2005a; Oliveira, Franca, Mendonça, & Barros-Júnior, 2006; Ramalakshmi, Kubra, & Rao, 2007), although no difference in the fatty acid content was observed (Oliveira et al., 2006). It is well known that cup quality may be correlated to a number of chemical compounds.
Previous research has linked cup quality to chlorogenic acids, protein, lipids, caffeine, trigonelline and volatile compounds (Franca et al., 2005b; Ribeiro, Augusto, Salva, & Ferreira, 2012). In the present study, a relatively small range of the FTIR spectrum, where lipids in particular absorb, explained most of the variance between the coffee samples. Sensory or cup quality is ideally evaluated employing light to medium roasted coffees, so cuppers can fully perceive their aroma and flavors. However, lower quality commercial coffees are often roasted to extra dark degrees, making an appropriate cupping difficult. In this study, green coffee samples previously subjected to cup quality evaluation were roasted to different degrees and analyzed by FTIR. Regardless of roasting degree, samples were separated by species and cup quality, suggesting that the methodology could serve as a suitable tool for the analysis of commercial coffees.
3.4. Classification analysis
PLS-DA was performed with the purpose of developing models to classify coffee samples by cup quality. The model was built using 73 samples previously analyzed by a sensory panel. Nineteen samples were classified as “soft”, 17 as “hard”, 13 as “rioysh”, 14 as “rio” and 10 as “rio zona”. Approximately 70% of these samples (49 samples) were randomly selected as a training set and the rest of the samples were used as a test set. Recognition ability was calculated as the percentage of members of the calibration set that were correctly classified, and prediction ability was calculated as the percentage of members of the validation set that were correctly classified. Based on the lowest cross validation classification error (CVCE), estimated using continuous blocks with 8 splits, 9 latent variables were chosen and employed in the model. Data were preprocessed using first derivative followed by Savitzky-Golay smoothing and mean-centering.
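The random selection of roughly 70% of the samples per class can be sketched as below. This is a hedged illustration, not the authors' procedure: the function `stratified_split`, the seed and the choice to round each class size down are assumptions. With the class sizes given above, rounding down happens to reproduce the reported 49 training and 24 test samples.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_percent=70, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Integer arithmetic: class size times percentage, rounded down.
        n_train = len(shuffled) * train_percent // 100
        train.extend(shuffled[:n_train])
        test.extend(shuffled[n_train:])
    return sorted(train), sorted(test)


# Class sizes reported for the 73 samples evaluated by the sensory panel.
counts = {"soft": 19, "hard": 17, "rioysh": 13, "rio": 14, "rio zona": 10}
labels = [name for name, n in counts.items() for _ in range(n)]

train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class, rather than over the pooled samples, keeps the smallest class (rio zona, 10 samples) represented in both sets.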
All samples from soft, rioysh, rio and rio zona were correctly classified in the training set (see Table 2). Only the hard class presented three false negative samples, two being classified as soft and one being classified as rioysh, with a sensitivity of 0.727. Specificity values ranged from 0.943 (soft) to 1.000 (hard, rio and rio zona). For the test set, results were worse than those obtained for the training set, with four
characterizes one specific class. All classes have similar VIP patterns. The only differences among them are the VIP intensities. Rioysh coffees presented higher VIP values in the lipid spectral region (2800–2900 cm−1), while rio zona presented higher VIP Scores in the fingerprint region (1200–700 cm−1), which can be correlated with carbohydrates. Rio coffees presented intermediate values in all of the highlighted spectral regions. This behavior reflects the quality of the coffees, where rioysh coffees are the ones with the highest quality among the three classes, while rio zona is the worst and rio is the middle class. When the multi-class (Table 2) and the HM (Table 3) models are compared, it is possible to see that improvement was achieved mainly on the HQ coffees. Soft coffee sensitivity rose from 0.667 to 1.000 in the test set and hard coffee sensitivity rose from 0.727 to 1.000 in the training set and from 0.500 to 0.833 in the test set. In the LQ coffees, improvements in the test set results were more modest, but were also present. Sensitivity in the rioysh classification rose from 0.500 to 0.750, and the specificity of the rio classification went from 0.714 to 0.842. Some parameters decreased, such as specificity for rio zona classification (which dropped from 0.905 to 0.889) and specificity for soft classification (from 0.933 to 0.833), but in general the HM approach yielded more improvements than losses in model performance.
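The sensitivity and specificity values compared above are per-class, one-vs-rest figures. The following minimal Python sketch shows the calculation; it is illustrative only and not the software routine used by the authors. The function name and the per-class training-set sizes (13 soft, 11 hard, 9 rioysh, 9 rio, 7 rio zona) are assumptions, chosen to be consistent with the 49 training samples and with the three misclassified hard samples described for the multi-class model.

```python
def sensitivity_specificity(y_true, y_pred, target):
    """One-vs-rest sensitivity and specificity for the class `target`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    tn = sum(1 for t, p in pairs if t != target and p != target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    # Assumes the class and its complement are both present (no zero division).
    return tp / (tp + fn), tn / (tn + fp)


# Assumed training-set composition (hypothetical class sizes, see text above).
y_true = (["soft"] * 13 + ["hard"] * 11 + ["rioysh"] * 9
          + ["rio"] * 9 + ["rio zona"] * 7)
# Three hard samples are misclassified: two as soft and one as rioysh.
y_pred = (["soft"] * 13
          + ["hard"] * 8 + ["soft"] * 2 + ["rioysh"] * 1
          + ["rioysh"] * 9 + ["rio"] * 9 + ["rio zona"] * 7)

sens, spec = sensitivity_specificity(y_true, y_pred, "hard")
print(f"hard: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

With these assumed counts, the hard class gives 8/11 = 0.727 for sensitivity and 1.000 for specificity, in line with the training-set values quoted in the text.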
Robusta ratio in roasted and ground coffee. Food Control, 59, 683–689. Bona, E., Marquetti, I., Link, J. V., Makimori, G. Y. F., da Costa Arca, V., Guimarães Lemes, A. L., ... Poppi, R. J. (2017). Support vector machines in tandem with infrared spectroscopy for geographical classification of green arabica coffee. LWT – Food Science and Technology, 76, 330–336. Bonnlander, B., Eggers, R., Engelhardt, U. H., & Maier, H. G. (2005). Roasting. In A. Illy, & R. Viani (Eds.). Espresso coffee, the science of quality (pp. 179–214). San Diego: Elsevier Academic Press. Brasil (2003). Instrução Normativa n. 8, de 11 de junho de 2003. Regulamento técnico de identidade e de qualidade para a classificação do café beneficiado grão cru. Brasília: Ministério da Agricultura, Pecuária e Abastecimento (Normative Instruction n.8, Technical norms of identity and quality for green coffee classification). Briandet, R., Kemsley, K., & Wilson, R. H. (1996). Discrimination of Arabica and Robusta in instant coffee by Fourier transform infrared spectroscopy and chemometrics. Journal of Agricultural and Food Chemistry, 44, 170–174. Craig, A. P., Franca, A. S., & Irudayaraj, J. (2014a). Vibrational spectroscopy for food quality and safety screening. In A. K. Bhunia, M. S. Kim, & C. R. Taitt (Eds.). High throughput screening for food safety assessment: Biosensor technologies, hyperspectral imaging and practical applications (pp. 165–194). Cambridge: Woodhead Publishing Ltd. Craig, A. P., Franca, A. S., & Oliveira, L. S. (2012). Discrimination between defective and non-defective roasted coffees by diffuse reflectance infrared Fourier transform spectroscopy. LWT – Food Science and Technology, 47, 505–511. Craig, A. P., Franca, A. S., Oliveira, L. S., Irudayaraj, J., & Ileleji, K. (2014b). Application of elastic net and infrared spectroscopy in the discrimination between defective and non-defective roasted coffees. Talanta, 128, 393–400. Craig, A. P., Franca, A. S., Oliveira, L. 
S., Irudayaraj, J., & Ileleji, K. (2015). Fourier transform infrared spectroscopy and near infrared spectroscopy for the quantification of defects in roasted coffees. Talanta, 134, 379–386. Esteban-Díez, I., González-Sáiz, J. M., & Pizarro, C. (2004). Prediction of sensory properties of espresso from roasted coffee samples by near-infrared spectroscopy. Analytica Chimica Acta, 525, 171–182. Falk, M., Gil, M., & Iza, N. (1990). Self-association of caffeine in aqueous solution: an FTIR study. Canadian Journal of Chemistry, 68, 1293–1299. Feria-Morales, A. M. (2002). Examining the case of green coffee to illustrate the limitations of grading systems/expert tasters in sensory evaluation for quality control. Food Quality and Preference, 13, 355–367. Fischer, M., Reimann, S., Trovato, V., & Redgwell, R. J. (2001). Polysaccharides of green Arabica and Robusta coffee beans. Carbohydrate Research, 330, 93–101. Franca, A. S., Mendonça, J. C., & Oliveira, S. D. (2005b). Composition of green and roasted coffees of different cup qualities. LWT – Food Science and Technology, 38, 709–715. Franca, A. S., Oliveira, L. S., Mendonça, J. C., & Silva, X. A. (2005a). Physical and chemical attributes of defective crude and roasted coffee beans. Food Chemistry, 90, 89–94. ISO (2000). Quality management system, principles and terminology. Geneva: International Organization for Standardization ISO 9000:2000. Kacurakova, M., Capek, P., Sasinkova, V., Wellner, N., & Ebringerova, A. (2000). FT-IR study of plant cell wall model compounds: pectic polysaccharides and hemicelluloses. Carbohydrate Polymers, 43, 195–203. Kemsley, E. K., Ruault, S., & Wilson, R. H. (1995). Discrimination between Coffea arabica and Coffea canephora variant robusta beans using infrared spectroscopy. Food Chemistry, 54, 321–326. Ky, C. L., Louarn, J., Dussert, S., Guyot, B., Hamon, S., & Noirot, M. (2001). Caffeine, trigonelline, chlorogenic acids and sucrose diversity in wild Coffea arabica L. and C. canephora P.
accessions. Food Chemistry, 75, 223–230. Lohumi, S., Lee, S., Lee, H., & Cho, B.-K. (2015). A review of vibrational spectroscopic techniques for the detection of food authenticity and adulteration. Trends in Food Science & Technology, 46, 85–98. López, M. I., Pillar Callao, M., & Ruisánchez, I. (2015). A tutorial on the validation of qualitative methods: From the univariate to the multivariate approach. Analytica Chimica Acta, 891, 62–72. Marquetti, I., Link, J. V., Lemes, A. L. G., Scholz, M. B. S., Valderrama, P., & Bona, E. (2016). Partial least square with discriminant analysis and near infrared spectroscopy for evaluation of geographic and genotypic origin of arabica coffee. Computers and Electronics in Agriculture, 121, 313–319. Mendonça, J. C. F., Franca, A. S., & Oliveira, L. S. (2015). Physical characterization of non-defective and defective arabica and robusta coffees before and after roasting. Journal of Food Engineering, 92, 474–479. Novaes, F. J. M., Oigman, S. S., Alves de Souza, R. O. M., Rezende, C. M., & Aquino Neto, F. R. (2015). New approaches on the analyses of thermolabile coffee diterpenes by gas chromatography and its relationship with cup quality. Talanta, 139, 159–166. Nunes, F. M., & Coimbra, M. A. (2002). Chemical characterization of the high-molecular-weight material extracted with hot water from green and roasted robusta coffees as affected by the degree of roast. Journal of Agricultural and Food Chemistry, 50, 7046–7052. Oliveira, L. S., Franca, A. S., Mendonça, J. C., & Barros-Júnior, M. C. (2006). Proximate composition and fatty acids profile of green and roasted defective coffee beans. LWT – Food Science and Technology, 39, 235–239. Pacetti, D., Boselli, E., Balzano, M., & Frega, N. G. (2012). Authentication of Italian Espresso coffee blends through the GC peak ratio between kahweol and 16-O-methylcafestol. Food Chemistry, 135, 1569–1574. Piccino, S., Boulanger, R., Descroix, F., & Sing, A. S. C. (2014).
Aromatic composition and potent odorants of the “specialty coffee” brew “Bourbon Pointu” correlated to its three trade classifications. Food Research International, 61, 264–271. Ramalakshmi, K., Kubra, I. R., & Rao, L. J. M. (2007). Physicochemical characteristics of
4. Conclusions
We have demonstrated the potential of FTIR and chemometrics for the classification of coffee by cup quality according to the Brazilian legislation. To the best of our knowledge, this is the first time that a methodology based on FTIR has been applied to the classification of coffees by cup quality regardless of their roasting degree. The classification results demonstrated that the proposed methodology could be a useful, rapid and reproducible tool for quality control and inspection purposes. Considering the frequent practice of blending arabica and robusta for the production of commercial coffees in Brazil, a rapid methodology for cup quality assessment of blends should be attempted in the future. Furthermore, the same should be attempted for the cup quality classification of pure robusta coffees, which represent a fast-growing market.
Acknowledgements
The authors acknowledge financial support from the following Brazilian Government Agencies: CNPq (Grant # 475746/2013-9; 306139/2013-8; 150039/2014-0) and FAPEMIG (Grant # PPM-0061915). We would like to thank the reviewers for their valuable comments and suggestions.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.foodchem.2017.11.066.
References
Alves, R. C., Casal, S., Alves, M. R., & Oliveira, M. B. (2009). Discrimination between arabica and robusta coffee species on the basis of their tocopherol profiles. Food Chemistry, 114, 295–299. Barbin, D. F., Felicio, A. L. S. M., Sun, D.-W., Nixdorf, S. L., & Hirooka, E. Y. (2014). Application of infrared spectral techniques on quality and compositional attributes of coffee: An overview. Food Research International, 61, 23–32. Bee, S., Brando, C. H. J., Brumen, G., Carvalhaes, N., Kolling-Speer, I., Speer, K., ... Vitzhum, O. G. (2005). The raw bean. In A. Illy, & R. Viani (Eds.). Espresso coffee, the science of quality (pp. 87–178).
San Diego: Elsevier Academic Press. Belchior, V., Franca, A. S., & Oliveira, L. S. (2016). Potential of diffuse reflectance infrared Fourier transform spectroscopy and chemometrics for coffee quality evaluation. International Journal of Food Engineering, 2, 1–8. Bertone, E., Venturello, A., Giraudo, A., Pellegrino, G., & Geobaldo, F. (2016). Simultaneous determination by NIR spectroscopy of the roasting degree and Arabica/
Robert, P., Marquis, M., Barron, C., Guillon, F., & Saulnier, L. (2005). FT-IR investigation of cell wall polysaccharides from cereal grains. Arabinoxylan infrared assignment. Journal of Agricultural and Food Chemistry, 53, 7014–7018. Romano, R., Santini, A., Le Grottaglie, L., Manzo, N., Visconti, A., & Ritieni, A. (2014). Identification markers based on fatty acid composition to differentiate between roasted Arabica and Canephora (Robusta) coffee varieties in mixtures. Journal of Food Composition and Analysis, 35, 1–9. Silverstein, R. M., Webster, F. X., Kiemle, D. J., & Bryce, D. L. (2015). Spectrometric identification of organic compounds. John Wiley & Sons Inc. Speer, K., & Kölling-Speer, I. (2006). The lipid fraction of the coffee bean. Brazilian Journal of Plant Physiology, 18, 201–216. Thaler, H. (1979). The chemistry of coffee extraction in relation to polysaccharides. Food Chemistry, 4, 13–22. Wise, B. M., & Gallagher, N. B. (2013). PLS toolbox software user guide. Eigenvector Research Inc. Zanin, R. C., Corso, M. P., Kitzberger, C. S. G., Scholz, M. B. S., & Benassi, M. T. (2016). Good cup quality roasted coffees show wide variation in chlorogenic acids content. LWT – Food Science and Technology, 74, 480–483.
green coffee: comparison of graded and defective beans. Journal of Food Science, 72, S333–S337. Reis, N., Botelho, B. G., Franca, A. S., & Oliveira, L. S. (2017). Simultaneous detection of multiple adulterants in ground roasted coffee by ATR-FTIR spectroscopy and data fusion. Food Analytical Methods, 11, 1–10. Reis, N., Franca, A. S., & Oliveira, L. S. (2013). Discrimination between roasted coffee, roasted corn and coffee husks by diffuse reflectance infrared Fourier transform spectroscopy. LWT – Food Science and Technology, 50, 715–722. Reis, N., Franca, A. S., & Oliveira, L. S. (2016). Concomitant use of Fourier transform infrared attenuated total reflectance spectroscopy and chemometrics for quantification of multiple adulterants in roasted and ground coffee. Journal of Spectroscopy, 2016, 1–7. Ribeiro, J. S., Augusto, F., Salva, T. J. G., & Ferreira, M. M. C. (2012). Prediction models for Arabica coffee beverage quality based on aroma analyses and chemometrics. Talanta, 101, 253–260. Ribeiro, J. S., Ferreira, M. M. C., & Salva, T. J. G. (2011). Chemometric models for the quantitative descriptive sensory analysis of Arabica coffee beverages using near infrared spectroscopy. Talanta, 83, 1352–1358.