Prediction of selected rheological characteristics of wheat based on Glutopeak test parameters


Journal of Cereal Science 91 (2020) 102898

Contents lists available at ScienceDirect

Journal of Cereal Science journal homepage: http://www.elsevier.com/locate/jcs

Bogna Zawieja a, Agnieszka Makowska b,*, Mateusz Gutsche c

a Department of Mathematical and Statistical Methods, Poznań University of Life Sciences, Wojska Polskiego 28, Poznań, 60-637, Poland
b Institute of Food Technology of Plant Origin, Poznań University of Life Sciences, Wojska Polskiego 28, Poznań, 60-637, Poland
c GoodMills Poland, Serbska 4, 61-596, Poznań, Poland

ARTICLE INFO

Keywords: Wheat; Classical rheological parameters; GlutoPeak; Sequential, logistic and PLS regression models

ABSTRACT

The GlutoPeak®-Test, a new rapid small-scale technique, has been proposed as an alternative method for evaluating wheat grain and as a tool for predicting wheat flour quality. Samples obtained from an industrial mill were analyzed by the GlutoPeak test (whole grain flours) as well as by farinograph and extensograph tests (refined flours). First, linear correlation coefficients between water absorption, dough stability, dough energy and the defined GlutoPeak parameters were calculated. Next, a sequential multiple quadratic regression analysis (backward, forward and stepwise), a logistic regression analysis and a PLSR analysis were applied. The correlations between flour water absorption and most of the parameters obtained from the GlutoPeak test were strong (|r| ≥ 0.74, p < 0.001). For stability the r value was 0.40, while for energy it was 0.44. Based on the obtained results it can be stated that for water absorption the sequential regression model gave the best fit, for dough stability the sequential regression model and the PLSR model gave the best fit, whereas the logistic regression model was the best fitted to energy. Unfortunately, cross validation showed that the last model is not good enough for energy prediction.

* Corresponding author. E-mail address: [email protected] (A. Makowska).
https://doi.org/10.1016/j.jcs.2019.102898
Received 7 October 2019; Received in revised form 6 December 2019; Accepted 17 December 2019; Available online 23 December 2019
0733-5210/© 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Wheat flour is used for baking bread, biscuits, cookies and many other products. The intended use of a flour depends on its properties, which result from its chemical composition as well as from interactions between flour components. Flour properties are connected with grain quality and are usually determined by means of classical rheological tests, such as farinographic, alveographic or extensographic analyses (Bekes, 2012; Dobraszczyk and Morgenstern, 2003). Unfortunately, a full characterization of wheat grain on the basis of classical tests is time consuming: it takes about 3 h (Marti et al., 2015a). Because grain quality assessment takes so long, the stage of grain acceptance into the mill is very long. To streamline this process it is necessary to find new rapid methods of grain quality evaluation. High hopes in this area have been raised by the newly developed shear-based GlutoPeak test (Chandi and Seetharaman, 2011). With this instrument the measurement of rheological parameters (peak maximum time, maximum torque, torque before maximum, torque after maximum, and the areas between A0 and A1, A1 and A2, …, A4 and A5) takes about 10 min. Over the last few years, several studies concerning sample granulation, sample/solvent weight ratio, mixing speed, testing temperature and kind of solvent have been carried out in order to identify the most appropriate conditions for the GlutoPeak analysis (Bouachra et al., 2017; Chandi and Seetharaman, 2011; Malegori et al., 2018; Rakita et al., 2018; Wang et al., 2018).

GlutoPeak is a relatively new tool for measuring gluten aggregation. Marti et al. (2015b) stated that GlutoPeak results are correlated with the gluten protein composition of flour and that this test could be an alternative to the labor-intensive quantitation of quality-related protein fractions of wheat flour. GlutoPeak could be proposed either as a screening method for breeders to determine the quality of wheat lines at early stages (Malegori et al., 2018) or as a tool for bakers to predict end-product quality (Marti et al., 2015b). It could also be a fast and reliable method for checking wheat grain quality at the receiving station, which would streamline and shorten the process of grain reception in the mill (Marti et al., 2015a). In order to replace traditional methods with this new test, millers would like to know how the parameters obtained with this instrument are correlated with the parameters obtained with conventional rheological tests. They need an algorithm to convert the values obtained with the new instrument into parameters with which they are already acquainted.

So far, there have been few papers on this subject in the literature. Sissons (2016) analyzed durum wheat semolina by the GlutoPeak test and conventional methods and found that the GlutoPeak parameter PMT (peak maximum time) is suited to separating samples with a weak and a strong gluten index. Rakita et al. (2018) also confirmed that the GlutoPeak test is successful in discriminating wheat varieties of good quality from those of poor quality. Bouachra et al. (2017) developed a linear model for loaf volume in the Micro Baking Test, which was a function of the protein content of the flour and the GlutoPeak parameter TbM (torque before maximum). Evaluation of this function with an independent data set led to 64% correct and 36% incorrect predictions. They therefore concluded that the model can be useful for wheat classification but cannot be expected to replace the actual measurement of loaf volume in a standard baking test. Marti et al. (2015b) developed multivariate Partial Least Squares Regression models using GlutoPeak parameters to predict classical rheological parameters (farinographic, alveographic, extensographic). They analyzed a large number of common wheat flours (120 samples) with different end uses and stated that these models allow the prediction of 16 of 26 conventional rheological parameters on the basis of GlutoPeak indices. Rakita et al. (2018) examined hard winter flours and showed statistically significant correlations between GlutoPeak indices and farinographic, alveographic and mixographic characteristics and breadmaking properties of the flours. They developed a backward regression model for predicting water absorption (WA) and stated that this model accounted for 80.5% of the variance in WA. A similar model for dough stability explained only 59.9% of the variability. Malegori et al. (2018) used the GlutoPeak test to characterize wholegrain and refined wheat flour. After a multivariate analysis they found that the GlutoPeak test is able to classify wheat according to dough stability, which is widely used for defining wheat end uses. The researchers divided the flours into two classes, 1st class (stability higher than 17.5 min) and 2nd class (stability lower than 17.5 min), and developed k-Nearest Neighbours classification models using the GlutoPeak profiles to predict farinographic stability. The average prediction ability of these models was 81.8% for both refined and whole grain flour data. This suggested the possibility of predicting farinographic stability by the GlutoPeak test directly on whole grain flour, thus skipping the refinement process.

The aim of this study is to compare several statistical models used for predicting classical farinograph (water absorption, stability) and extensograph (energy) parameters of refined flour dough on the basis of parameters obtained from the GlutoPeak test of whole grain flours, and to indicate the best of these models. In similar experiments strong and weak flours were usually studied, whereas in our research weak and medium flours were examined. Moreover, our approach is unique because sequential methods of multiple quadratic regression analysis and logistic regression analysis have not yet been widely used for predicting classical rheological characteristics on the basis of GlutoPeak test results.

2. Material and methods

2.1. Material

A set of 552 samples of commercial wheat grain obtained from the GoodMills Poland Company in Grodzisk Wielkopolski constituted the material for this experiment. For the GlutoPeak analysis, samples of wholegrain flour were prepared in a Laboratory Mill 3100 (Perten Instruments, Hägersten, Sweden). Samples of refined flour (yield 60–65%) for the extensograph and farinograph analyses were milled in a CD1 mill (Chopin Technologies, Villeneuve-la-Garenne Cedex, France).

2.2. Rheological tests

The GlutoPeak analysis of 552 wholegrain wheat flour samples was carried out with the GlutoPeak instrument (Brabender GmbH and Co KG, Duisburg, Germany). Flour (8 g, 14% m.b.) was dispersed in 10 mL of a 2% NaCl solution in a stainless steel sample cup. The samples were analyzed at a rotating paddle speed of 2750 rpm and a temperature of 36 °C. The following parameters were measured: peak maximum time (PMT), maximum torque (MT), torque 15 s before maximum (TbM), torque 15 s after maximum (TaM), and the areas between A0 and A1 (A0_1), A1 and A2 (A1_2), …, and A4 and A5 (A4_5).

The farinograph analysis of 63 samples of flour was conducted with a Farinograph®-E (Brabender GmbH and Co KG, Duisburg, Germany) according to AACC Method no. 54–21.02 (AACC International, 2011b). Among the recorded parameters, water absorption (WA) and dough stability (S) were taken into account for further studies.

The extensograph analysis of 550 refined flour samples was carried out with an Extensograph®-E (Brabender GmbH and Co KG, Duisburg, Germany) according to AACC Method no. 54–10.01 (AACC International, 2011a). Dough energy (E) after 45 min resting time was noted and used for further calculations.

2.3. Statistical methods

Linear correlation coefficients were calculated between the parameters obtained from the conventional instruments and from GlutoPeak. Next, sequential regression (SR), sequential logistic regression (LR) and partial least squares regression (PLSR) methods were used to convert the GlutoPeak parameters (explanatory variables) into WA, S and E (response variables). The multiple SR model was the following:

$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p + \beta_{p+1} x_1^2 + \beta_{p+2} x_2^2 + \ldots + \beta_{2p} x_p^2 + e \tag{1}
$$

where y is the response variable (WA, S or E), x_l (l = 1, 2, …, p) are the explanatory variables (parameters obtained from GlutoPeak), p is the number of explanatory variables (p = 9), β_l are the regression coefficients and e is the error.

If many independent variables are analyzed, the regression model can become complicated, which could cause interpretation difficulties. Therefore, sequential procedures for selecting variables (backward, forward and stepwise) were used. The choice of the SR model depended on the adjusted coefficient of determination (R²), the Akaike information criterion (AIC; Akaike, 1973) and the root mean square error (RMSE). R² and AIC were determined for each of the three equations obtained. The model with the smallest AIC and the largest R² was selected as the best fit to the data. In the case of discrepancies between the two criteria the AIC result was used, because R² tends to increase with the number of explanatory variables included.

The flour quality and purpose parameters (WA, S and E) were categorized on the basis of the millers' scale for the corresponding parameters (Table S1). Next, sequential (backward and forward) LR analyses were applied to them. The response variable was given on an ordinal scale, whereas the explanatory variables were measurable data. In this situation the multinomial logistic regression model was used:

$$
\eta(\gamma_i) = \operatorname{logit} \gamma_i = \log \frac{\gamma_i}{1 - \gamma_i} = \theta_i + \boldsymbol{\beta}'\mathbf{x} = \theta_i + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p \tag{2}
$$

where γ_i = π_1 + π_2 + … + π_i was the ith cumulative probability, θ_i was the cut point of the ith category and β_l was the effect of the lth variable. The probability corresponding to the ith category was calculated as (Agresti, 2010):

$$
p_i = \frac{\exp(\theta_i + \boldsymbol{\beta}'\mathbf{x})}{1 + \exp(\theta_i + \boldsymbol{\beta}'\mathbf{x})} - \frac{\exp(\theta_{i-1} + \boldsymbol{\beta}'\mathbf{x})}{1 + \exp(\theta_{i-1} + \boldsymbol{\beta}'\mathbf{x})} \tag{3}
$$

To test the hypothesis H0: β = 0, the Wald test statistic was applied, which under the null hypothesis has an approximate χ² distribution (McCulloch and Searle, 2001). The choice of the logistic regression model depended on the AIC and the scaled χ² criterion (the quotient of the likelihood of the fitted model and the likelihood of the null model, divided by the degrees of freedom). The model with the smallest AIC and a χ² value close to 1 was selected (McCullagh and Nelder, 1989) as the model with the best fit to the data. In this study mainly the AIC criterion was used. For the above two models, the set of independent variables was reduced to non-collinear variables.

In PLSR one or several response variables, Y = [Y1, Y2, …, Ym]', can be modeled by means of a set of independent variables X = [X1, X2, …, Xp]', where Yl and Xk are N-vectors of observations of the lth (l = 1, 2, …, m) and kth (k = 1, 2, …, p) variable. The PLSR model was the following:

$$
Y = XB + E \tag{4}
$$

where B = WQ. The matrices W and Q were obtained using PCA and regression analysis. PLSR balances the two objectives of explaining response variation (regression analysis) and explaining predictor variation (PCA). Two different formulations of PLSR can be used: the original predictive method of Wold (1966) and the SIMPLS method of de Jong (1993). In this paper the Wold method was used. The split-sample validation method was carried out to find the best number of factors (the number of columns of matrix W) in the model, using the PRESS (predicted residual sum of squares) statistic, which was minimized.

Significance was marked by * for α = 0.05, ** for α = 0.01 and *** for α < 0.0001.

All determined models were evaluated by the cross-validation method. The tested grain samples were randomly divided into two parts. The model was determined on the basis of the first part, the second part of the samples was classified on the basis of this model, and vice versa. For this purpose, both the observed values and the values predicted by the regression and PLSR models for both parts of the grain samples were classified (Table S1).

3. Results and discussion

Descriptive statistics of the analyzed GlutoPeak parameters and of the selected farinograph (WA and S) and extensograph (E) properties are presented in Table S2.

During the GlutoPeak test the slurry of flour and water was subjected to intense mechanical shearing by a paddle rotating at a predetermined speed. Under those conditions a gluten network was formed in the suspension. The counter torque resulting from gluten aggregation upon mixing and the time required to reach peak resistance were registered as a torque curve. In previous studies significant correlations were found between some GlutoPeak parameters (PMT, MT, peak areas) and sedimentation value, protein content, WA and dough properties such as E, dough mixing time, S, tenacity, extensibility and W value, as well as bread volume (Bouachra et al., 2017; Geisslitz et al., 2018; Güçbilmez et al., 2019; Huen et al., 2017; Marti et al., 2014, 2015a, 2015b; Michel et al., 2017). The GlutoPeak test had usually been applied to refined wheat flour, but its application was recently extended to whole meal flours (Malegori et al., 2018).

In our research the results of the GlutoPeak analysis of wholegrain wheat flours were used to predict selected rheological properties of dough obtained from refined flour (WA, S and E). Moreover, the GlutoPeak analyses were carried out with the addition of NaCl (2%). This protocol was found to be applicable for analyzing weak flours (Rakita et al., 2018). When NaCl is added to flour during mixing, it shields the charges on the gluten molecules, lowers the repulsion between proteins and allows protein chains to connect strongly to each other, which improves gluten network formation (Miller and Hoseney, 2008).

In our experiment the rheological parameters of whole grain slurries measured with the GlutoPeak test (PMT, MT, TbM, TaM and specific areas) varied over a relatively wide range, as shown by the coefficients of variation (cv) (Table S2). WA of the analyzed refined flours varied widely, in the range 55.8–74.9%. Based on the results of the classical rheological tests it can be said that the tested flours were characterized by medium and weak gluten (S ranged between 1.9 and 12.5 min, average 5.5 min, and E ranged between 0 and 149 cm², average 32 cm²). It can be noticed that in some cases the E values equaled 0 even though the results of the farinographic tests indicated that the protein network in the dough was not very weak. As reported by mill laboratories, that situation can take place when grain is milled immediately after harvest.

In the first step of the analysis, correlation coefficients between the GlutoPeak parameters and the selected classical rheological properties were calculated (Table 1). WA was very strongly positively correlated with the GlutoPeak parameters MT, A4_5 and A3_4, as well as TaM, and negatively with PMT. This was in accordance with the study of Rakita et al. (2018). High correlations between GlutoPeak MT and WA of refined flour were likewise reported by Fu et al. (2017) and Michel et al. (2017). A similar relationship in research on whole grain flour of various wheat species was also noted by Geisslitz et al. (2018). Significant correlations were found between S and the GlutoPeak parameters TaM, A2_3, A3_4, A4_5 and TbM, but they did not exceed 0.40. This was in agreement with the results obtained by Geisslitz et al. (2018). In the case of E most of the correlation coefficients were close to zero, but owing to the large number of observations many of them were nevertheless statistically significant. Dependencies between extensograph and GlutoPeak properties were studied by Fu et al. (2017), as well as by Rakita et al. (2018). They reported that GlutoPeak MT and other properties were independent of dough strength measured by the extensograph, but Marti et al. (2015a) found a strong relationship between GlutoPeak properties and both E and maximum resistance to extension.

3.1. Sequential regression models (SR)

Table 1
Pearson correlation coefficients (r) and probabilities (p) between farinograph and extensograph parameters and GlutoPeak parameters.

GlutoPeak    Water absorption (WA)    Stability (S)       Energy (E)
parameter    r         p              r        p          r        p
PMT          −0.74*    0.000          0.03     0.839      0.45*    0.000
MT           0.85*     0.000          0.22     0.0823     0.31*    0.000
TbM          0.26*     0.040          0.27*    0.032      0.04     0.318
TaM          0.79*     0.000          0.40*    0.001      0.22*    0.000
A0_1         0.66*     0.000          0.12     0.356      0.22*    0.000
A1_2         0.65*     0.000          0.09     0.499      0.13*    0.003
A2_3         0.12      0.340          0.40*    0.001      0.37*    0.000
A3_4         0.81*     0.000          0.34*    0.007      0.24*    0.000
A4_5         0.83*     0.000          0.30*    0.016      0.28*    0.000

PMT - peak maximum time; MT - maximum torque; TbM - torque before maximum; TaM - torque after maximum; A0_1 - area between A0 and A1; A1_2 - area between A1 and A2, …, A4_5 - area between A4 and A5; r - correlation coefficient; p - probability.

Among the regression models, the best fit (the smallest AIC) was obtained with the stepwise procedure for both WA and S, while for E the AIC was smallest for the backward procedure; these models were therefore recognized as the best (Table 2).

Table 2
Parameters of model fit.

Multiple quadratic regression model
                     Water absorption (WA)     Stability (S)           Energy (E)
Type of procedure    m     R²      AIC         m    R²      AIC        m     R²      AIC
Stepwise             5     0.862   88.39       3    0.337   78.92      7     0.257   3566.51
Backward             8     0.861   93.00       7    0.269   91.45      8     0.288   3566.03
Forward              10    0.853   92.05       9    0.340   83.90      10    0.296   3569.58

Logistic regression model
                     Water absorption (WA)     Stability (S)           Energy (E)
Type of procedure    m     χ²      AIC         m    χ²      AIC        m     χ²      AIC
Backward             1     0.155   20.72       4    0.657   93.76      4     1.030   1209.46
Forward              1     0.155   20.72       1    1.096   109.81     2     0.961   1240.29

m - number of independent variables in the model.

The regression equations obtained in the sequential procedure of the multiple quadratic regression were the following:

$$
\begin{aligned}
WA &= 78.935^{***} - 0.284^{***}\,PMT + 0.139^{**}\,MT - 0.015\,A0\_1 + 0.003^{***}\,A2\_3 + 0.0005^{***}\,PMT^2 \\
S &= 1.865 + 0.006^{***}\,A1\_2 + 0.003^{***}\,A2\_3 - 0.0001^{***}\,PMT^2 \\
E &= 63.764^{***} + 0.815^{***}\,PMT + 0.609\,TbM + 0.801^{*}\,TaM + 0.029\,A0\_1 - 0.020^{**}\,A1\_2 - 0.001^{***}\,PMT^2 - 0.007^{**}\,MT^2 - 0.007\,TbM^2
\end{aligned}
\tag{5}
$$

A significant quadratic dependency of the considered classical rheological properties on PMT was shown. In the case of WA this dependency was decreasing for PMT from 32 to 150 and stabilized for PMT larger than 150. The opposite situation occurred in the case of S and E. As PMT increased from 32 to 150, S also increased, but for PMT larger than 150 S slightly decreased. It was difficult to see any clear dependency for E, but the pattern of points took roughly the shape of a parabola. Moreover, a positive linear dependency of these properties on A2_3 was detected. It should be noted that most of the parameters specified by GlutoPeak were needed to assess E.

In the paper by Rakita et al. (2018) only the backward regression method was used to determine the dependency between rheological and breadmaking properties of wheat flour and GlutoPeak parameters. In that experiment PMT, MT, aggregation energy and aggregation time were considered among the GlutoPeak parameters. WA depended on aggregation energy and time (R² = 0.814), S depended on MT and aggregation energy (R² = 0.618), while E was not related to any GlutoPeak indices. A greater part of the variability of WA was explained by the model presented in our study, whereas for S the model of Rakita et al. (2018) explained a greater part of the variability. PMT in the cited study ranged from 80 s to 405 s, and both the lower and the upper limit were much higher than those obtained in our study (15–288 s), while MT ranged from 22 BU to 58 BU (our interval was wider: 7–72 BU). The values of WA, S and E in the cited study varied in slightly different ranges (51.2–64.3%, 1.2–20.5 min and 15–136 cm², respectively), but the largest differences were observed for PMT.
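As a rough illustration of how a sequential procedure guided by AIC can arrive at models such as those in Eq. (5), the sketch below runs a forward selection over linear and squared terms. It is not the software or the exact procedure used in this study: the function names (`quadratic_design`, `aic_ols`, `forward_select`), the Gaussian form of the AIC and the synthetic data are all assumptions made for illustration, and only the forward variant is shown.

```python
import numpy as np


def quadratic_design(X):
    """Columns x_1..x_p followed by x_1^2..x_p^2, as in Eq. (1) (intercept added later)."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, X ** 2])


def aic_ols(Z, y, cols):
    """Gaussian AIC, up to an additive constant, of an OLS fit of y on an intercept plus Z[:, cols]."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [Z[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1] + 1  # regression coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k


def forward_select(Z, y):
    """Add, one at a time, the term that lowers AIC the most; stop when no term lowers it."""
    selected, remaining = [], list(range(Z.shape[1]))
    best = aic_ols(Z, y, selected)
    while remaining:
        cand_aic, cand = min((aic_ols(Z, y, selected + [j]), j) for j in remaining)
        if cand_aic >= best:
            break
        best = cand_aic
        selected.append(cand)
        remaining.remove(cand)
    return selected, best


# Synthetic stand-in data (not the study's measurements): y depends on x_1 and x_2^2 only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 3))
y = 5.0 - 2.0 * X[:, 0] + 0.3 * X[:, 1] ** 2 + rng.normal(0.0, 0.5, size=200)
Z = quadratic_design(X)  # columns 0-2 are linear terms, columns 3-5 their squares
selected, best_aic = forward_select(Z, y)
print("selected column indices:", selected, "AIC:", round(best_aic, 2))
```

A backward variant would start from the full set of terms and remove them one at a time, and a stepwise variant would allow both moves; the AIC comparison at each step stays the same.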

3.2. Logistic regression models (LR)

Before this analysis the flour samples were assigned to classes (Table S1). The fit criteria obtained for the LR models of WA, S and E are presented in Table 2. For WA the AIC value was the same for the backward and forward procedures and the χ² criteria were also equal (Table 2), but unfortunately the χ² value was very small, which indicated a very poor fit of the model to the data. For S and E, the AIC was smaller for the backward than for the forward procedure (Table 2), so in terms of AIC the models built with the backward procedure fitted the data better. For E, the χ² criterion was close to 1 for both models (backward and forward), but it was nearest to 1 in the backward procedure. For S this criterion was close to 1 in the forward procedure, whereas in the backward procedure it was too small; thus the result of the forward selection was considered. The LR equations for the individual rheological properties were the following.

For WA:

$$
\begin{aligned}
\operatorname{logit} \gamma_3 &= -7.762^{***} + 0.028^{***}\,PMT \\
\operatorname{logit} \gamma_4 &= -5.202^{***} + 0.028^{***}\,PMT
\end{aligned}
\tag{6}
$$

For S:

$$
\begin{aligned}
\operatorname{logit} \gamma_1 &= 0.151^{***} - 0.002^{***}\,A2\_3 \\
\operatorname{logit} \gamma_2 &= 3.700^{***} - 0.002^{***}\,A2\_3 \\
\operatorname{logit} \gamma_3 &= 5.468^{***} - 0.002^{***}\,A2\_3
\end{aligned}
\tag{7}
$$

For E:

$$
\begin{aligned}
\operatorname{logit} \gamma_1 &= 2.260 - 0.016^{***}\,PMT - 0.149^{***}\,MT - 0.165^{***}\,TaM - 0.001^{***}\,A2\_3 \\
\operatorname{logit} \gamma_2 &= 4.116^{***} - 0.016^{***}\,PMT - 0.149^{***}\,MT - 0.165^{***}\,TaM - 0.001^{***}\,A2\_3 \\
\operatorname{logit} \gamma_3 &= 5.367^{***} - 0.016^{***}\,PMT - 0.149^{***}\,MT - 0.165^{***}\,TaM - 0.001^{***}\,A2\_3 \\
\operatorname{logit} \gamma_4 &= 7.688^{***} - 0.016^{***}\,PMT - 0.149^{***}\,MT - 0.165^{***}\,TaM - 0.001^{***}\,A2\_3
\end{aligned}
\tag{8a}
$$
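The step from cumulative logits to class probabilities (formula (3)) and then to the assigned class can be sketched as follows. This is a minimal illustration rather than the code used in the study: the helper names are hypothetical, the cut points and slope are illustrative values patterned on Eq. (7), and the A2_3 reading of 1500 is invented.

```python
import math


def cumulative_logit_probs(cut_points, eta):
    """Class probabilities p_1..p_K from K-1 increasing cut points theta_i and eta = beta'x."""
    def expit(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Cumulative probabilities gamma_1..gamma_{K-1}; the last class closes the sum at 1.
    gammas = [expit(theta + eta) for theta in cut_points] + [1.0]
    probs, previous = [], 0.0
    for gamma in gammas:
        probs.append(gamma - previous)  # p_i = gamma_i - gamma_{i-1}
        previous = gamma
    return probs


def predicted_class(cut_points, eta, first_label=1):
    """Label of the class with the largest probability."""
    probs = cumulative_logit_probs(cut_points, eta)
    return first_label + max(range(len(probs)), key=probs.__getitem__)


# Illustrative values only: three cut points and one explanatory variable.
cut_points = [0.151, 3.700, 5.468]
a2_3 = 1500.0          # hypothetical GlutoPeak area reading
eta = -0.002 * a2_3    # linear predictor beta'x
probs = cumulative_logit_probs(cut_points, eta)
print([round(p, 3) for p in probs], "-> class", predicted_class(cut_points, eta))
```

Because the cut points are increasing, the cumulative probabilities are increasing too, so every class probability is non-negative and the probabilities sum to 1.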

From formula (3) the probabilities for WA (p3, p4 and p5, connected with classes 3, 4, and 5 respectively); for S (p1, p2, p3 and p4, connected with classes 1, 2, 3, and 4) and for E (p1, p2, p3, p4 and p5, connected with classes 1, 2, 3, 4 and 5) were calculated separately for each sample of grain. The grain was in the category for which the probability was maximal. 3.3. Partial Least Squares Regression models (PLSR) Before application of the PLSR method, PCA had been applied to establish the number of variables which were included in the model. The first two principal components explained 80% of total variability (Table S3). From variables presented in Fig. 1 those least correlated, best-spanning space and having the largest variance variables were selected, namely PMT, MT, TbM, TaM and Area A2_3. WA was charac­ terized by high variability (the long vector in Fig. 1) but S and E were characterized by small variability. WA was strongly correlated with MT and A0_1, S was strongly correlated with TbM and E was correlated with 4

B. Zawieja et al.

Journal of Cereal Science 91 (2020) 102898

obtained from GTP Marti et al. (2015b) obtained much better results of determining old parameters using the new ones (WA: R2 ¼ 0.91; S: R2 ¼ 0.88 and E: R2 ¼ 0.78). Our result for WA was similar, the differences in the percentage of explained variability for S and E could result from differences in the variability ranges of these parameters in both studies. 3.4. Cross validation procedures for model selection In this part of the study the results of measurements were divided into two data sets (the first set – teaching samples and the second set – learning samples). The models were obtained on the basis of the first set, while the second set was classified on the basis of the calculated model. Next, the roles of the data sets were switched (the second set – teaching samples and the first set – learning samples). In order to do cross validation the data and also predictors (obtained from models) had to be classified using the classification given in Table S1. The percentage of correctly classified cases (samples of grain) ranged from 45 to 100 when the classification was performed using the coefficients of models designated for all data sets (Table 4). The per­ centage of correctly classified cases ranged from 53 to 97 when the coefficients of models were designated using parts of data (cross vali­ dation). The best classification was obtained for WA. Regardless of the model used the cross validation gave minimum 90% of correct classifi­ cations. SR turned out to be the best. Samples from class 5 were always assigned to class 5 (Fig. 2) but only with SR the cases belonging to class 3 were assigned to this class. The numbers of parameters included in the models for WA differed (see Equations (4), (5) and (8)), PMT was included in each model. Moreover, in the logistic models only this parameter was included (percentage of correctly classified cases of grain was the smallest (90%). 
Most parameters were included in PLSR (due to the specificity of this model the number was the same for all tested grains) but the percentage of correctly classified cases of grain was only 92. Thus it was noted that too many and too few parameters did not improve the model’s properties (evaluated on the basis of cross valida­ tion). This result was confirmed when we compared the goodness of fitting models to the data (Table 4). Namely, the explained variability (R2) of the response variable by SR was 0.86; by PLSR it was 0.52 and χ2 for LR it was only 0.16 (while AIC was smaller when for SR). Moreover RMSE had the smallest value in the case of the SR method (Table 4). Thus the SR model was the best for WA. In the case of S the percentage of correct classifications was similar for each model and ranged from 69 to 71 when cross validation method was used (Table 4). Thus each model gave similar results. However, in the SR and PLSR methods none of the samples were classified to class 4 (samples from this class were assigned to classes 2 and 3 and in PLSR also to class 1) and using the LR model none of the samples were clas­ sified to class 3 (Fig. 2). All elements belonging to the class 4 were assigned to the class 2 by the SR and LR models, but PLSR gave a different result of cross validation: all elements belonging to class 4 were assigned into classes 1 and 2. The goodness of fit parameters (Table 4) was as follows: the percentage of explained variability (R2⋅100%) of the response variable was 34 for SR, 52 for PSLR and χ2 for LR was 1.10. AIC was smaller for the SR model than for the LR model. RMSE was the smallest for the SR model (Table 4). Summarizing, with such a difficult choice, the PLSR or SR model should be selected but due to a very small percentage of the used variability of explanatory variables (29) the SR was recognized as better. The cross classification for E was the worst and ranged from 53% to 56%. 
As it was mentioned in the Methods paragraph only data for which all parameters were measured (by new and old instruments) could be used for building the PLSR model. This resulted in the fact that while building a model not all grain samples for which dough E was measured were included. However, all grain samples were used for cross valida­ tion. In all methods most of the samples of grain from classes 3, 4 and 5 were assigned to classes 1 and 2, it was the strongest for SR (Fig. 2). The first and second classes were scattered across classes 1–4, but for SR also

Fig. 1. Parameters presented on the plane of the first two prin­ cipal components. Table 3 Summary of partial least squares analysis PLS. Factor

Mean R2 for Y

Mean R2 for X

R2 for Water absorption (WA)

R2 for Stability (S)

R2 for Energy (E)

1 2 3 4 5

0.262 0.466 0.490 0.520 0.525

0.513 0.781 0.929 0.955 0.992

0.747 0.747 0.749 0.778 0.792

0.025 0.208 0.231 0.292 0.292

0.0158 0.443 0.490 0.490 0.490

A2_3. In PLSR all selected independent variables were used both in linear and square forms. Since in this analysis the number of included factors has a huge impact on the result of analysis, the split-sample validation method (PRESS) was used to determine the number of factors. The smallest PRESS (0.824) was received for five factors, thus these factors were used in the PLSR analysis. In Table 3 the values of statistics R2 depending on the number of factors were presented. For independent variables X, R2 was 0.99 for five factors and for dependent variables R2 was 0.52. Designated equations were the following (as it was mentioned in the Methods in PLSR the significance of coefficients was not designated): WA ¼ 64.822–0.044þPMT þ 0.121þMT þ 0.011 TbM – 0.013þTaM – 0.0002 A2_3 þ 0.0001þ PMT2 þ 0.001þMT2þ 0.0001 TbM2 – 0.001þ TaM 2þ 0.00001 A2_32 S ¼ 1.152 þ 0.008þPMT – 0.140 MT þ 0.005 TbM þ 0.089 TaM þ 0.0011þA2_3 – 0.00001 PMT2 – 0.0005 MT2 – 0.0002 TbM2 þ 0.001 TaM2

(8b)

E ¼ 25.798 þ 0.052 PMT – 0.060 MT–0.023 TbM þ 0.257 TaM þ 0.0018þ A2_3 þ 0.00009 PMT2 – 0.0008 MT2 – 0.001 TbM2 þ 0.003 TaM 2þ 0.00007þ A2_32 The largest importance on the basis of the VIP criterion (VIP>1) was marked by þ. In the paper by Marti et al. (2015b) the ranges of variation of parameters PMT, MT and WA were smaller than the ranges presented in our researches (Table S2). S values obtained in their experiment were higher than ours and classified in 3rd and 4th classes (Table S1). Our data was classified in 1st, 2nd and 3rd classes. E values of samples presented by Marti et al. (2015b) were classified in classes 3rd, 4th, 5th but in our research in all classes. Despite similar ranges of parameters 5


Table 4
Percentage of correctly classified samples after using the estimated parameters in the regression analysis, and goodness-of-fit criteria.

Percentage of correctly classified samples

Model                   SR full model(a)   SR cross validation(b)   LR full model(a)   LR cross validation(b)   PLSR full model(a)   PLSR cross validation(b)
Water absorption (WA)   100                97                       92                 90                       93                   92
Stability (S)           73                 69                       69                 71                       75                   69
Energy (E)              45                 55                       53                 56                       73                   53

Goodness-of-fit criteria

Model                   SR AIC   SR R²   SR RMSE   LR AIC   LR χ²   LR RMSE   PLSR R²   PLSR RMSE
Water absorption (WA)   88       0.862   1.85      21       0.16    4.03      0.788     2.21
Stability (S)           79       0.337   1.77      110      1.10    2.05      0.291     1.88
Energy (E)              3566     0.288   26.81     1204     1.03    1.68      0.493     33.90

(a) Model built on the basis of all cases.
(b) N models built on the basis of N − 1 cases (where N is the number of cases).

in class 5. R² (Table 4) for PLSR was larger (0.52) than for SR (0.29), the AIC criterion was smaller for LR (1204) than for SR (3566), and χ² was close to 1. Given these results and the cross-validation results, the LR model was selected as the best of all models for assessing E. Moreover, RMSE was smallest for this method, whereas RMSE was very high for the SR and PLSR methods. Unfortunately, this model is still not satisfactory.

4. Conclusions

The results of our work show that some conventional parameters of dough from refined flour can be predicted by analysing whole grain flour in the GlutoPeak test. This new rapid test could be a useful tool for classifying flours with a weak or medium gluten system intended for pastry or waffle production. Among all the developed models, the best one for predicting water absorption was the sequential regression model. For dough stability, the PLSR model and the sequential regression model were the best, while for energy the logistic regression model had the best fit. Cross validation showed that the WA class of the examined flour can be determined with high accuracy using the parameters obtained from the wholegrain flour GlutoPeak test. However, the remaining results for S and E were less promising despite the use of a non-linear method: 71% of the samples were correctly classified for dough S, and only 56% for dough E. This indicates that the developed model for dough energy does not allow flour to be classified correctly on the basis of the GlutoPeak test results. Therefore, further research will be conducted to develop better models that will enable the prediction of dough stability and energy, as well as other classical rheological parameters, from GlutoPeak test profiles.
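The cross-validation figures discussed above come from building N models on N − 1 cases each and classifying the held-out case (Table 4, note b). The sketch below illustrates that leave-one-out procedure in plain Python. It is an illustration under stated assumptions, not the authors' implementation: a one-predictor least-squares line stands in for the SR, LR and PLSR models, and the class boundaries and data are hypothetical (the paper's class limits are given in Table S1, which is not reproduced here).

```python
from bisect import bisect_right


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def to_class(value, boundaries):
    """Map a value to a 1-based class; boundaries must be ascending.

    A value equal to a boundary is placed in the higher class.
    """
    return bisect_right(boundaries, value) + 1


def loo_percent_correct(xs, ys, boundaries):
    """Leave-one-out cross validation: percent of held-out cases whose
    predicted class equals the class of the measured value."""
    hits = 0
    for i in range(len(xs)):
        # Build the model on all cases except case i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predicted = intercept + slope * xs[i]
        if to_class(predicted, boundaries) == to_class(ys[i], boundaries):
            hits += 1
    return 100.0 * hits / len(xs)


if __name__ == "__main__":
    # Hypothetical predictor and response values, invented for this example.
    xs = [10.0, 30.0, 50.0, 70.0, 90.0, 110.0]
    ys = [51.5, 52.6, 55.4, 56.8, 59.3, 60.7]
    print(loo_percent_correct(xs, ys, boundaries=[54.0, 58.0]))
```

Comparing this percentage with the one obtained when the model is fitted to all cases shows how much of the apparent accuracy survives on unseen samples, which is the comparison made between the full-model and cross-validation columns of Table 4.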
Fig. 2. Cross validation classification (vertical axis); true classification (bar colours): 1, 2, 3, 4, 5.

Formatting of funding sources

The publication was co-financed within the framework of the Ministry of Science and Higher Education programme "Regional Initiative Excellence" in the years 2019–2022, Project No. 005/RID/2018/19.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Bogna Zawieja: Conceptualization, Methodology, Formal analysis, Writing - original draft, Writing - review & editing. Agnieszka Makowska: Conceptualization, Methodology, Writing - original draft, Writing - review & editing. Mateusz Gutsche: Conceptualization, Resources.


Appendix A. Supplementary data

Supplementary data related to this article can be found at https://doi.org/10.1016/j.jcs.2019.102898.

References

AACC International, 2011. Approved methods of analysis. In: Method 54-10.01. Extensograph Method, General. January 6, 2011, eleventh ed. AACC International, St. Paul, MN, U.S.A.

AACC International, 2011. Approved methods of analysis. In: Method 54-21.02. Rheological Behavior of Flour by Farinograph: Constant Flour Weight Procedure. January 6, 2011, eleventh ed. AACC International, St. Paul, MN, U.S.A.

Agresti, A., 2010. Analysis of Ordinal Categorical Data, second ed. Wiley, New York.

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (Eds.), Second International Symposium on Information Theory. Akademiai Kiado, Budapest.

Bekes, F., 2012. New aspects in quality related wheat research: 1. Challenges and achievements. Cereal Res. Commun. 40, 159–184.

Bouachra, S., Begemann, J., Aarab, L., Hüsken, A., 2017. Prediction of bread wheat baking quality using an optimized GlutoPeak®-test method. J. Cereal Sci. 76, 8–16.

Chandi, G.K., Seetharaman, K., 2011. Optimization of gluten peak tester: a statistical approach. J. Food Qual. 35, 69–75.

de Jong, S., 1993. SIMPLS: an alternative approach to partial least squares regression. Chemometr. Intell. Lab. Syst. 18, 251–263.

Dobraszczyk, B.J., Morgenstern, M.P., 2003. Rheology and the breadmaking process. J. Cereal Sci. 38, 229–245.

Fu, B.X., Wang, K., Dupuis, B., 2017. Predicting water absorption of wheat flour using high shear-based GlutoPeak test. J. Cereal Sci. 76, 116–121.

Geisslitz, S., Wieser, H., Scherf, K.A., Koehler, P., 2018. Gluten protein composition and aggregation properties as predictors for bread volume of common wheat, spelt, durum wheat, emmer and einkorn. J. Cereal Sci. 83, 204–212.

Güçbilmez, Ç.M., Şahin, M., Akçacık, A.G., Aydoğan, S., Demir, B., Hamzaoğlu, S., Yakışır, E., 2019. Evaluation of GlutoPeak test for prediction of bread wheat flour quality, rheological properties and baking performance. J. Cereal Sci. 90, 102827.

Huen, J., Boersmann, J., Matullat, L., Boehm, L., Stukenborg, F., Heitmann, M., Zannini, E., Arendt, E.K., 2017. Wheat flour quality evaluation from the baker's perspective: comparative assessment of 18 analytical methods. Eur. Food Res. Technol. 244, 535–545.

Malegori, C., Grassi, S., Ohm, J.-B., Anderson, J., Marti, A., 2018. GlutoPeak profile analysis for wheat classification: skipping the refinement process. J. Cereal Sci. 79, 73–79.

Marti, A., Augst, E., Cox, S., Koehler, P., 2015. Correlations between gluten aggregation properties defined by the GlutoPeak test and content of quality-related protein fractions of winter wheat flour. J. Cereal Sci. 66, 89–95.

Marti, A., Ulrici, A., Foca, G., Quaglia, L., Pagani, M.A., 2015. Characterization of common wheat flours (Triticum aestivum L.) through multivariate analysis of conventional rheological parameters and gluten peak test indices. LWT - Food Sci. Technol. (Lebensmittel-Wissenschaft -Technol.) 64, 95–103.

Marti, A., Cecchini, C., D'Egidio, M.G., Dreisoerner, J., Pagani, M.A., 2014. Characterization of durum wheat semolina by means of a rapid shear-based method. Cereal Chem. 91, 542–547.

McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. Chapman and Hall, London.

McCulloch, C.E., Searle, S.R., 2001. Generalized, Linear, and Mixed Models. Wiley, New York.

Michel, S., Gallee, M., Löschenberger, F., Buerstmayr, H., Kummer, C., 2017. Improving the baking quality of bread wheat using rapid tests and genomics: the prediction of dough rheological parameters by gluten peak indices and genomic selection models. J. Cereal Sci. 77, 24–34.

Miller, R.A., Hoseney, R.C., 2008. Role of salt in baking. Cereal Foods World 53, 4–6.

Rakita, S., Dokić, L., Dapčević Hadnađev, T., Hadnađev, M., Torbica, A., 2018. Predicting rheological behavior and baking quality of wheat flour using a GlutoPeak test. J. Texture Stud. 49, 339–347.

Sissons, M., 2016. GlutoPeak: a breeding tool for screening dough properties of durum wheat semolina. Cereal Chem. 93, 550–556.

Wang, J., Hou, G.G., Liu, T., Wang, N., Bock, J., 2018. GlutoPeak method improvement for gluten aggregation measurement of whole wheat flour. LWT - Food Sci. Technol. (Lebensmittel-Wissenschaft -Technol.) 90, 8–14.

Wold, H., 1966. Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P.R. (Ed.), Multivariate Analysis. Academic Press, New York.