Artificial neural network for quantitative determination of total protein in yogurt by infrared spectrometry


Microchemical Journal 91 (2009) 47–52


Mohammadreza Khanmohammadi a,⁎, Amir Bagheri Garmarudi a,b, Keyvan Ghasemi a, Salvador Garrigues c, Miguel de la Guardia c

a Department of Chemistry, Faculty of Science, Imam Khomeini International University, Qazvin, Iran
b Department of Chemistry & Polymer Laboratories, Engineering Research Institute, Tehran, Iran
c Analytical Chemistry Department, Universitat de Valencia, Jerònim Muñoz building, C/Dr. Moliner, 50, 46100 Burjassot, Valencia, Spain

Article info

Article history: Received 26 May 2008; received in revised form 21 July 2008; accepted 21 July 2008; available online 25 July 2008.

Keywords: Yogurt; Protein content; ATR-FTIR; Artificial neural network; Back propagation; Successive Projection Algorithm

Abstract

A method is presented for the quantitative determination of protein content in yogurt samples, based on the characteristic absorbance of protein in the 1800–1500 cm⁻¹ spectral region, using mid-infrared FTIR spectroscopy and chemometrics. The chemometric approach couples the Successive Projection Algorithm (SPA) wavelength selection procedure with a feed-forward Back-Propagation Artificial Neural Network (BP-ANN) model. The Relative Error of Prediction (REP) for the training set was 7.25 with BP-ANN and 3.70 with SPA-BP-ANN. Considering the complexity of the sample, the ANN model was found to be reliable, and the proposed method is rapid and simple, requiring no sample preparation step.

© 2008 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Department of Chemistry, Faculty of Science, Imam Khomeini International University, P.O. Box 288, Qazvin, Iran. Tel./fax: +98 281 3780040. E-mail address: [email protected] (M. Khanmohammadi).
doi:10.1016/j.microc.2008.07.003

1. Introduction

According to the US Food and Drug Administration (FDA), there are three existing US federal standards of identity for yogurt: non-fat yogurt, low-fat yogurt and yogurt. It is in the best interest of both yogurt manufacturers and consumers for the FDA to modernize these standards. Therefore, the FDA's Center for Food Safety and Applied Nutrition (CFSAN) has a stated goal of developing a new rule that amends the yogurt identity standards [1]. The main studies of yogurt nutritional parameters by means of Fourier transform infrared (FTIR) spectroscopy concern the determination of cholesterol [2] and sorghum fermentation control [3].

One of the most important ingredients of yogurt is protein. Analytical methods for the determination of protein in food products are generally based on the Kjeldahl [4] and Lowry methods or related modified procedures [5,6] which utilize spectrophotometry. Reversed-phase HPLC, capillary electrophoresis and diffuse reflectance infrared Fourier-transform spectrometry (DRIFTS) are some other techniques [7–9]. Many other studies deal with the determination of the protein content of different food products, e.g. by the biuret reaction [10], 4th derivative UV spectrophotometry [11] and near-infrared reflectance spectroscopy [12]. Recently, an attempt was made to develop a fast and accurate method based on ATR-FTIR spectroscopy for estimating the nutritional parameters of yogurt products, in order to provide an appropriate tool for the quality control of these products. The protein content of yogurt samples was determined by applying the partial least squares (PLS) chemometric technique [9].

Even though PLS assumes a linear relationship between the measured sample parameters and the intensity of the absorption bands, several authors have postulated that small deviations from linearity are acceptable, as they can readily be suppressed by including additional modeling factors. However, in the presence of substantial non-linearity, PLS tends to give large prediction errors and more suitable models are called for. PLS is popular because it is able to resolve overlapping spectral responses. A similar statement also holds true for artificial neural networks (ANN). The interest in using artificial neural networks stems from the fact that concentration-dependent relationships are not always linear. Intrinsically non-linear calibration techniques such as non-linear partial least squares, locally weighted regression (LWR) [13,14], alternating conditional expectations (ACE) and artificial neural networks (ANN) [15–17] are applicable in such cases. However, it is important to state that these methods are computationally more complex than linear methods, and they depend heavily on the amount and quality of the data available.

The objective of this research was to develop a simple alternative to existing analytical methods for the quantification of yogurt protein content by using FTIR spectroscopy coupled with the Successive Projection Algorithm (SPA), a forward selection method which uses simple operations in a vector space to minimize variable collinearity, as the variable selection methodology, and to exploit wavelength selection as a powerful tool to enhance predictive ability. The choice of wavelengths critically affects the future predictive ability of the model. Recently, several selection methods have been developed, e.g. artificial neural network (ANN), Tabu search [18], hybrid linear analysis (HLA) [19], and the Successive Projection Algorithm (SPA) [20]. Owing to their computational power and their applicability to extremely complex problems, genetic algorithms have also been the subject of several publications [21,22].

2. Analytical methods

2.1. Apparatus and software

All spectra were recorded on a Bruker® Tensor 27 FTIR spectrometer (Germany) supported by Opus software. The spectrometer was equipped with a Specac lamp IN-Compartment Contact 151 Sampler Horizontal ATR from Graseby Specac (Orpington, UK), which was employed for spectral acquisition with a 45° crystal and a six-reflection ZnSe cell through the top-plate. The data obtained from WINFIRST software were exported in ASCII format and transferred to a Pentium-IV PC for subsequent manipulation. All calculations were performed in MATLAB® (version 7.1). SPA wavelength selection and BP-ANN were also implemented in MATLAB®. MATLAB® functions were used for hierarchical cluster analysis in order to evaluate the similarity between samples in terms of their ATR-FTIR spectra and to assess the number of characteristic subsets into which the samples could be divided. Criteria similar to those already published for milk and fruit juice classification were used [23].

2.2. ATR-FTIR analysis of yogurt samples

In total, 59 commercial yogurt samples were obtained from the market, covering a wide range of the available types of yogurt: plain, sugar-added, non-fat, low-fat, high-fat, flavored yogurts, yogurt mousse, etc. Reference concentration data for the protein content of the samples were provided by the manufacturers according to the FDA-approved standard test methods. The original samples were placed in the same temperature-controlled room as the spectrometer before carrying out the analysis. Samples were shaken vigorously to homogenize them as far as possible before filling the ATR cell. After shaking, 4 replicate FTIR spectra were recorded for each sample in the 4000–400 cm⁻¹ spectral region, averaging 49 scans per spectrum with a nominal resolution of 8 cm⁻¹. In order to minimize the influence of the signal due to the water content of yogurt, which is centered around 1640 cm⁻¹ and overlaps with the protein-related absorbance bands, water was set as the background for all spectral measurements (Fig. 1).

2.3. Analysis of protein related IR signals

The main signals in the mid-IR region lie in 1800–1500 cm⁻¹. There is a band at about 1750 cm⁻¹ which is associated with the C=O stretching of proteins. In addition, the C=O stretching band of amide I and the N–H bending band of amide II are both located in this spectral region.

3. Chemometrics methods and results

3.1. Outlier detection

The term “outlier detection” actually encompasses two steps: first, atypical object detection, followed by outlier identification. Although numerical methods allow flagging of samples that are outliers on statistical grounds, the positive identification of an atypical object as a true outlier requires knowledge of the process or data acquisition procedure, or interaction with the person in charge of this acquisition. The simplest tool for flagging atypical objects before modeling, and the one used in this study, is visual observation of the available X and Y data. The original set of sample spectra, the vector of responses and the score plots on the first principal components (PCs) must be considered. In order to detect outliers in the X space, it is recommended to examine the leverage of each sample. The leverage of a sample is a measure of its spatial distance to the main body of the samples in X. For a given data matrix X, the leverage of sample i is given by the diagonal term $p_{ii}$ of the “Leverage Matrix” P (Eq. (1)):

$$P = X\left(X^{T}X\right)^{-1}X^{T} \tag{1}$$

High leverage points have large values of $p_{ii}$ (diagonal elements of the P matrix) and special attention should be paid to these points.
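As an illustration of Eq. (1), the sketch below computes the diagonal elements of the leverage matrix and flags samples above the 0.7 threshold used in this study. It is a minimal example under stated assumptions, not the authors' MATLAB implementation: the small `scores` matrix is made up (in the paper the leverage is computed on 6 PC scores of the spectra), and `solve` and `leverages` are hypothetical helper names.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def leverages(X):
    """Diagonal elements p_ii of P = X (X^T X)^-1 X^T (rows of X are samples)."""
    n_vars = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(n_vars)]
           for a in range(n_vars)]
    # p_ii = x_i^T (X^T X)^-1 x_i, obtained by solving (X^T X) z = x_i
    return [sum(xi * zi for xi, zi in zip(row, solve(XtX, row))) for row in X]

# Made-up score matrix: 6 samples x 2 PCs; the last sample is atypical.
scores = [[0.1, 0.2], [-0.2, 0.1], [0.2, -0.1],
          [-0.1, -0.2], [0.0, 0.1], [3.0, 2.5]]
p = leverages(scores)
flagged = [i for i, p_ii in enumerate(p) if p_ii > 0.7]   # 0.7 threshold from the paper
print(flagged)   # [5]
```

The leverages sum to the number of columns of X (here 2), so a single sample with a leverage close to 1 dominates one whole direction of the model, which is why such points deserve special attention.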

Fig. 1. ATR-FTIR spectra of yogurt samples (a) and water (b) with water background.


They have a strong influence on parameter estimation and can alter the model dramatically if they happen to be true outliers. Principal Component Analysis was first applied to the primary samples to extract the essential scores: 6 PCs were selected, and a new absorbance matrix built from these PCs replaced the X matrix in Eq. (1) to calculate the leverage values. Samples whose leverage (the corresponding diagonal element of the P matrix) exceeded a threshold of 0.7 were omitted, so 50 samples were retained for analysis (Fig. 2).

3.2. Cluster analysis

After outlier detection, an important step is to select appropriate calibration and validation data sets that give a minimum error in model prediction. One methodology for this procedure is hierarchical cluster analysis (HCA). In HCA, the similarity between samples is established using the concept of a “distance” between samples (calculated using a mathematical relationship, i.e. the Euclidean norm), which reflects how similar the numerical properties of the samples are (e.g. the absorbance at different wavelengths). To perform HCA on a data set, the similarity or dissimilarity between every pair of objects in the data set needs to be determined. First, the Euclidean distance between pairs of objects in the spectroscopic data matrix is computed by Eq. (2):

$$d_{AB} = \left[\sum_{j}\left(x_{1j} - x_{2j}\right)^{2}\right]^{1/2} \tag{2}$$

where $d_{AB}$ is the distance between a pair of objects and $x_{ij}$ is the value of the jth variable measured on the ith object. The objects are then grouped into a binary, hierarchical cluster tree using the complete linkage algorithm, defined by:

$$d_{k(i,j)} = \max\left(d_{k,i},\, d_{k,j}\right) \tag{3}$$

where $d_{k(i,j)}$ is the distance between group k and a new group (i, j) formed by the fusion of groups i and j. The last step is to determine where to cut the hierarchical tree into clusters and to generate a dendrogram plot of the hierarchical, binary cluster tree. Evaluating possible groups among the considered samples by a clustering method before ANN data treatment is an important task: it allows the proper selection of a representative calibration set, thus improving the prediction ability for unknown samples. Fig. 3 shows the dendrographic clustering of samples obtained with 6 factors, using complete linkage. The dendrogram shows two separate main groups which seem to be related to the carbohydrate content, which is directly correlated with the intensity of the main ATR-FTIR band. In each group, samples were separated into 5 smaller clusters according to other parameters such as protein, fat, and calcium contents, as well as carbohydrate content. Note that the dendrogram was built using the 1800–900 cm⁻¹ spectral region.

Fig. 3. Dendrographic clustering of yogurt samples based on the Euclidean distance after performing PCA analysis (7 factors) on ATR-FTIR spectra and applying the complete linkage method.

3.3. Calibration, validation and real samples

An important step in the development of any calibration model is the splitting of the available data into two subsets: a training set (used to estimate model parameters) and a validation set or test set (used to check the generalization ability of the model). In the case of ANNs the problem is more complex because they can fit the training data to arbitrary precision, provided that the number of hidden nodes is sufficient and the training time is long enough. Therefore, an additional monitoring set is necessary to stop the training before the ANN learns idiosyncrasies present in the training data. The monitoring set must be representative of the population under study in order to avoid ANN overtraining, which leads to overfitting. In the present work, the data for the models were chosen based on the results from HCA, using the dendrogram of Fig. 3 to select 3 data sets: calibration, monitoring and independent test set. The selection criterion was to use at least one sample from each cluster for calibration. The number of samples placed in the validation set was approximated by the square root of the number of samples in the cluster (unless there was only one sample in the cluster), and the rest of the samples were included in the calibration set. The new calibration and monitoring sets thus comprised 35 samples, and the remaining 15 samples were used to validate the model.
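The clustering of Section 3.2 and the per-cluster split described above can be sketched as follows. This is a simplified illustration, not the authors' MATLAB code: the six two-variable "spectra" are made up, the helper names are hypothetical, and which members of a cluster are held out is an arbitrary choice here (the paper does not state how they were picked).

```python
import math

def euclidean(a, b):
    """Eq. (2): Euclidean distance between two objects."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(data, n_clusters):
    """Agglomerative clustering. The distance between two clusters is the
    largest pairwise distance between their members, which is the value
    the recursive update of Eq. (3) produces."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(euclidean(data[i], data[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))   # fuse the two closest clusters
    return clusters

def split_cluster(members):
    """Hold out about sqrt(n) samples of a cluster for validation; a
    single-sample cluster goes entirely to calibration."""
    n_val = 0 if len(members) == 1 else round(math.sqrt(len(members)))
    return members[n_val:], members[:n_val]   # (calibration, validation)

# Made-up data forming two well-separated groups.
data = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2], [5.0, 5.1], [5.1, 5.0]]
clusters = complete_linkage(data, 2)
splits = [split_cluster(sorted(c)) for c in clusters]
print(splits)   # [([2, 3], [0, 1]), ([5], [4])]
```

Cutting the tree at a fixed number of clusters is only one option; cutting at a distance threshold read from the dendrogram would serve the same purpose.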
In the previous report [9], the PLS calibration was performed separately on each of the major clusters in the data. A neural network may be necessary here because it was deemed desirable to merge these two calibrations into a single calibration.

3.4. Artificial neural networks

Fig. 2. Outlier detection for 50 samples with threshold of 0.7.

3.4.1. Data compression

As pointed out earlier, the ratio of the number of samples to the number of adjustable parameters in the ANN should be kept as large as possible. One way of over-determining the problem is to compress the input data, especially when they consist of absorbances recorded at several hundred wavelengths. In addition to reducing the size of the input data, compression allows one to eliminate irrelevant information such as noise or redundancies present in a data matrix. Principal Component Analysis (PCA) creates “new” dimensions of the data and evaluates a reduced number of independent factors, or principal components, describing the information contained in a system of characteristic but partly dependent variables. The aim of PCA is to find a few components which explain the major variations within the data matrix. Thus PCA was performed to reduce the size of the original data matrix containing 35 samples and 79 wavelengths. When compressing data with PCA, one must be aware of some theoretical limitations. In order to find the best number of latent variables, leave-one-out cross-validation was applied to the models. According to the results, PCA yielded data sets containing 11 and 8 PCs for the ANN and SPA-ANN models respectively. These numbers were chosen so as to discard excess PCs while optimizing the models.

3.4.2. Neural model

An ANN is typically organized in layers, each made up of a number of interconnected nodes which contain an activation function. Input vectors are presented to the network via the input layer, which communicates with one or more “hidden layers” where the actual processing is done via a system of weighted “connections”. Most ANNs contain some form of “learning rule” which modifies the weights of the connections according to the input patterns presented to the network. There are many different kinds of learning rules used by neural networks; in this work, the Back-Propagation Artificial Neural Network (BP-ANN) was applied. In BP-ANN, “learning” is a supervised process that occurs with each cycle or “epoch” (i.e. each time the network is presented with a new input pattern) through a forward activation flow of inputs and the backward propagation of errors for weight adjustment.
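The learning cycle just described (forward activation, backward error propagation, weight adjustment) can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the authors' MATLAB model: the network size (`N_IN`, `N_HID`), the data and the initial weights are made up, and the update uses one common form of gradient descent with momentum, the variant this section goes on to discuss, in which the new change is `mc` times the last change plus `(1 - mc)` times the gradient step. The learning rate 0.8 and momentum 0.9 are the values listed in Table 1.

```python
import math

N_IN, N_HID = 2, 3   # hypothetical sizes, not the paper's architecture

def forward(w, x):
    """tanh ("tansig") hidden layer followed by a tanh output node."""
    h = []
    for j in range(N_HID):
        base = j * (N_IN + 1)
        h.append(math.tanh(sum(w[base + i] * x[i] for i in range(N_IN)) + w[base + N_IN]))
    out = N_HID * (N_IN + 1)
    y = math.tanh(sum(w[out + j] * h[j] for j in range(N_HID)) + w[out + N_HID])
    return h, y

def mse_and_gradient(w, data):
    """Mean squared error and its gradient, obtained by back-propagation."""
    g = [0.0] * len(w)
    err = 0.0
    out = N_HID * (N_IN + 1)
    for x, t in data:
        h, y = forward(w, x)
        err += (y - t) ** 2 / len(data)
        d_out = 2.0 * (y - t) * (1.0 - y * y) / len(data)   # error signal at the output
        g[out + N_HID] += d_out
        for j in range(N_HID):
            g[out + j] += d_out * h[j]
            d_hid = d_out * w[out + j] * (1.0 - h[j] ** 2)  # propagated to hidden node j
            base = j * (N_IN + 1)
            for i in range(N_IN):
                g[base + i] += d_hid * x[i]
            g[base + N_IN] += d_hid
    return err, g

def train(w, data, lr=0.8, mc=0.9, epochs=300):
    """Batch gradient descent with momentum."""
    w = list(w)
    change = [0.0] * len(w)
    history = []
    for _ in range(epochs):
        err, g = mse_and_gradient(w, data)
        history.append(err)
        change = [mc * c - (1.0 - mc) * lr * gi for c, gi in zip(change, g)]
        w = [wi + ci for wi, ci in zip(w, change)]
    return w, history

# Made-up toy data: the target is a smooth function of the two inputs.
data = [([a / 2.0, b / 2.0], 0.3 * a / 2.0 - 0.2 * b / 2.0)
        for a in (-2, -1, 0, 1, 2) for b in (-2, -1, 0, 1, 2)]
w0 = [0.1 * ((k * 7) % 5 - 2) for k in range(N_HID * (N_IN + 1) + N_HID + 1)]
w, history = train(w0, data)
print(history[0], history[-1])   # training error before and after
```

In practice the error on a separate monitoring set would be tracked alongside `history` and training stopped when it starts to rise, as described in Section 3.3.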
The simplest implementation of BP learning updates the network weights and biases in the direction in which the performance function decreases most rapidly, i.e. the negative of the gradient. An iteration of this algorithm can be written as Eq. (4):

$$X_{k+1} = X_{k} - \delta_{k}\, g_{k} \tag{4}$$

where $X_k$ is a vector of current weights and biases, $g_k$ is the current gradient, and $\delta_k$ is the learning rate. In this work, gradient descent with momentum was applied and the performance function was the Mean of the Sum of Squares Error (MSSE), the average squared error between the network outputs and the actual outputs. For the basic gradient descent algorithm, the weights and biases are moved in the direction of the negative gradient of the performance function. Gradient descent with momentum often provides faster convergence because momentum allows a network to respond not only to the local gradient but also to recent trends in the error surface. Momentum can also help the network to overcome a shallow local minimum in the error surface and settle down at or near the global minimum. Momentum can be added to back-propagation learning by making weight changes equal to the sum of a fraction of the last weight change and the new change suggested by the back-propagation rule. The momentum constant can be any number between 0 and 1. When the momentum constant is 0, the weight change is based solely on the gradient. When the momentum constant is 1, the new weight change is set equal to the last weight change and the gradient is simply ignored.

The performance of the network was also tested by reducing the dimension of the input vectors before the training process. The selected model for ANN was a 10–1 input–output model with Tan-sigmoid (Tansig) as the transfer function in each layer, while the selected SPA-ANN model was an 8–1 input–output model, also with Tansig as the transfer function. The selected parameters for the ANN models are detailed in Table 1.

Table 1
Selected parameters for BP-ANN and SPA-BP-ANN models

| Parameter | ANN | SPA-ANN |
|---|---|---|
| Number of layers | 2 | 2 |
| Number of nodes | 10 input–1 out | 8 input–1 out |
| Transfer function | Tansig–Tansig | Tansig–Tansig |
| Momentum | 0.90 | 0.90 |
| Learning rate | 0.80 | 0.80 |
| Selected PCs | 11 | 8 |

Fig. 4. Selected wavelength using SPA for ANN model.

3.5. Successive Projection Algorithm (SPA)

The main purpose of this algorithm is to select wavelengths whose information content is minimally redundant, in order to solve collinearity problems. The choice of wavelengths for model building using SPA is critical if the model is to have good future predictive ability. In order to generate a robust model, those wavelengths which are truly “causal” must be used; otherwise the model may produce good correlations merely by chance, from fortuitous noise trends. Traditionally, wavelength selection has involved the selection of all wavelengths in a particular region of the spectrum known to contain useful information. The most appropriate point to start the wavelength selection process is from a small number of wavelengths known to be useful for predicting the responses. SPA is a forward selection method, which starts with one wavelength and then incorporates a new one at each iteration, until a specified number of wavelengths is reached.

The Gram–Schmidt algorithm is used to construct an orthonormal basis for an inner product space. To explain the procedure, suppose that there are linearly independent vectors $v_1, \ldots, v_n$ which can be converted into a set of orthonormal vectors $q_1, \ldots, q_n$ by the Gram–Schmidt process. First $u_1 = v_1$, and then each $u_i$ is made orthogonal to the preceding $u_1, \ldots, u_{i-1}$ by subtraction of the projections of $v_i$ in the directions of $u_1, \ldots, u_{i-1}$:

$$u_i = v_i - \sum_{j=1}^{i-1} \frac{u_j^{T} v_i}{u_j^{T} u_j}\, u_j \tag{5}$$
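Eq. (5) translates almost directly into code. The sketch below is a plain-Python illustration with made-up example vectors and hypothetical helper names (`dot`, `gram_schmidt`, `normalize`); it computes each projection coefficient from the original $v_i$, exactly as written in the equation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors):
    """Eq. (5): subtract from each v_i its projections onto the preceding u_j."""
    us = []
    for v in vectors:
        u = list(v)
        for uj in us:
            coef = dot(uj, v) / dot(uj, uj)
            u = [ui - coef * uji for ui, uji in zip(u, uj)]
        us.append(u)
    return us

def normalize(u):
    """Scale u to unit length."""
    norm = math.sqrt(dot(u, u))
    return [x / norm for x in u]

# Made-up, linearly independent example vectors.
v = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
u = gram_schmidt(v)
q = [normalize(ui) for ui in u]
```

For these inputs the orthogonalized vectors are (1, 1, 0), (0.5, -0.5, 1) and (-2/3, 2/3, 2/3), whose pairwise dot products vanish.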

The vectors $u_1, \ldots, u_i$ span the same subspace as $v_1, \ldots, v_i$, and the vectors $q_i = u_i/\lVert u_i \rVert$ are orthonormal. In any inner product space, we can choose the basis in which to work, and working in an orthogonal basis often greatly simplifies calculations. SPA, by contrast, does not modify the original data vectors, since projections are used only for selection purposes. Thus, the relation between spectral variables and data vectors is preserved. The selection of causal wavelengths represents a difficult task, particularly with full-spectrum multivariate techniques such as PLS.
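A minimal sketch of the projection-based selection step follows, based on the commonly described formulation of SPA [20] rather than on the authors' MATLAB code. Each column of the calibration matrix is one wavelength; the starting column and the number of wavelengths to select are inputs. The function name `spa` and the tiny data set are made up for illustration, and the sketch covers only the projection step, not the subsequent evaluation of candidate subsets by their prediction error.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def spa(columns, start, n_select):
    """Select n_select column indices, beginning with `start`. At each step
    the remaining columns are projected onto the subspace orthogonal to the
    last selected one, and the column with the largest remaining norm
    (the least collinear one) is chosen next."""
    work = [list(c) for c in columns]   # projected copies; originals stay untouched
    selected = [start]
    while len(selected) < n_select:
        ref = work[selected[-1]]
        ref_norm2 = dot(ref, ref)
        best, best_norm = None, -1.0
        for k, col in enumerate(work):
            if k in selected:
                continue
            coef = dot(ref, col) / ref_norm2
            work[k] = [c - coef * r for c, r in zip(col, ref)]   # remove the part along ref
            norm = math.sqrt(dot(work[k], work[k]))
            if norm > best_norm:
                best, best_norm = k, norm
        selected.append(best)
    return selected

# Made-up data: 3 "wavelengths" measured on 4 samples.
# Column 1 is almost a multiple of column 0; column 2 carries new information.
columns = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.1],
           [1.0, -1.0, 1.0, -1.0]]
print(spa(columns, start=0, n_select=2))   # [0, 2]
```

The nearly collinear column 1 is skipped in favor of column 2, which is the behavior the text describes: the projections serve only to rank candidates, and the selected wavelengths keep their original values.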


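Table 2
Actual and predicted values of protein content (%) in yogurt samples with ANN and SPA-ANN algorithm for training set

| Sample | Actual | Predicted by ANN | Predicted by SPA-ANN |
|---|---|---|---|
| 1 | 3.1 | 3.7 | 3.1 |
| 2 | 3.2 | 3.6 | 3.3 |
| 3 | 4.0 | 4.0 | 3.9 |
| 4 | 4.0 | 3.7 | 3.8 |
| 5 | 3.1 | 3.6 | 3.5 |
| 6 | 4.2 | 3.4 | 3.7 |
| 7 | 4.0 | 3.9 | 4.0 |
| 8 | 4.6 | 4.1 | 4.6 |
| 9 | 4.5 | 4.4 | 4.5 |
| 10 | 3.8 | 4.0 | 3.9 |
| 11 | 4.5 | 4.4 | 4.5 |
| 12 | 3.5 | 3.7 | 3.6 |
| 13 | 3.9 | 4.0 | 3.9 |
| 14 | 4.1 | 4.1 | 4.1 |
| 15 | 3.5 | 3.4 | 3.5 |
| 16 | 3.5 | 3.6 | 3.5 |
| 17 | 3.8 | 3.8 | 3.7 |
| 18 | 3.9 | 3.8 | 4.0 |
| 19 | 3.5 | 3.7 | 3.5 |
| 20 | 4.5 | 4.3 | 4.4 |
| 21 | 4.0 | 4.0 | 4.0 |
| 22 | 4.1 | 4.0 | 4.1 |
| 23 | 3.9 | 4.3 | 3.9 |
| 24 | 4.1 | 4.1 | 4.1 |
| 25 | 4.4 | 4.3 | 4.5 |
| 26 | 4.3 | 4.2 | 4.2 |
| 27 | 3.3 | 3.5 | 3.4 |
| 28 | 3.4 | 3.6 | 3.6 |
| 29 | 3.6 | 3.5 | 3.6 |
| 30 | 3.5 | 3.6 | 3.5 |
| RMSE a | | 0.28 | 0.14 |
| REP b | | 7.25 | 3.70 |
| R² | | 0.77 | 0.95 |

a Root Mean Square Error.
b Relative Error of Prediction.

Table 4
Actual and predicted values of protein content (%) in yogurt samples with ANN and SPA-ANN algorithm for independent test set

| Sample | Actual | ANN | SPA-ANN |
|---|---|---|---|
| 1 | 3.9 | 3.6 | 3.6 |
| 2 | 4.3 | 4.0 | 4.1 |
| 3 | 3.5 | 3.9 | 3.8 |
| 4 | 3.9 | 3.8 | 3.9 |
| 5 | 4.2 | 3.5 | 3.7 |
| 6 | 3.2 | 3.3 | 3.3 |
| 7 | 3.4 | 3.7 | 3.6 |
| 8 | 4.1 | 4.4 | 4.3 |
| 9 | 4.6 | 4.4 | 4.5 |
| 10 | 4.4 | 4.1 | 4.3 |
| 11 | 3.8 | 3.6 | 3.7 |
| 12 | 3.1 | 3.4 | 3.2 |
| 13 | 3.2 | 3.5 | 3.5 |
| 14 | 4.5 | 4.7 | 4.6 |
| 15 | 4.5 | 4.7 | 4.6 |
| 15 | 3.6 | 3.8 | 3.7 |
| RMSE | | 9.32 | 6.98 |
| R² | | 0.70 | 0.86 |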
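The figures of merit reported in the tables can be computed along the following lines. The paper does not state its formulas, so the definitions below are assumptions: RMSE divides by the number of samples n, REP is taken as 100 × RMSE divided by the mean reference value, and R² is the coefficient of determination. The published values may therefore not be reproduced exactly (for instance if n − 1 or a squared correlation coefficient was used). The numbers in the example are made up and are not taken from the tables.

```python
import math

def rmse(actual, predicted):
    """Root mean square error, dividing by the number of samples."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def rep(actual, predicted):
    """Relative error of prediction in percent: 100 * RMSE / mean(actual)."""
    return 100.0 * rmse(actual, predicted) / (sum(actual) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up protein contents (%), for illustration only.
actual = [3.0, 4.0, 5.0]
predicted = [3.5, 4.0, 4.5]
print(round(rmse(actual, predicted), 3),
      round(rep(actual, predicted), 2),
      round(r_squared(actual, predicted), 2))   # 0.408 10.21 0.75
```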

Recently, a number of selection methods have been proposed in which wavelengths are selected through interactive trial-and-error processes. The most appropriate point to start the wavelength selection process is from a small number of wavelengths known to be useful for predicting the y variable in question. To obtain these wavelengths, a forward stepwise procedure was used. This method starts with a small number of wavelengths. A PLS model was built and leave-one-out cross-validation was used to determine the optimum number of factors for the model in each iteration; the Root Mean Square Error of Prediction (RMSEP) was then calculated and compared with the RMSEP of the validation set, to establish whether the selected variables were appropriate to contribute to the model. Subsequent wavelengths are only added to or removed from the X block if the prediction ability of the model is improved. After finding the best subset to replace the full spectrum, the selected wavelengths were used in the ANN model to build the prediction model. Applying this strategy, 58 wavelengths were selected, which are shown in Fig. 4. The training set (30 × 79) was used to develop the BP-ANN model and the data set (30 × 58) to develop the SPA-BP-ANN model, and these were in turn used to predict the protein content of the yogurt samples (Table 2). Five samples were used for the monitoring set (Table 3) and 15 samples for the validation set (test set). Notably, the MSSE value changed in the SPA-BP-ANN model.

Table 3
Actual and predicted values of protein content (%) in yogurt samples with ANN and SPA-ANN algorithm for monitoring set

| Sample | Actual | ANN | SPA-ANN |
|---|---|---|---|
| 1 | 4.2 | 3.8 | 3.9 |
| 2 | 3.9 | 3.6 | 3.7 |
| 3 | 3.8 | 4.1 | 3.9 |
| 4 | 4.5 | 4.6 | 4.5 |
| 5 | 3.3 | 3.7 | 3.5 |
| RMSE | | 0.37 | 0.21 |
| REP | | 9.28 | 5.38 |
| R² | | 0.64 | 0.89 |

There is no reason why results obtained on a monitoring set should not be reported, as long as it is made clear that these results were obtained on the data set used to determine the training end-point. Interestingly, the results for the monitoring set show a lower error for SPA-ANN than for the ANN model: the Root Mean Square Error (RMSE) of the monitoring set was 0.37 and 0.21 for the ANN and SPA-ANN models respectively. One must be aware of the limitations of this approach: a true validation error (test error) is a better estimator of the generalization ability of the ANN than a monitoring error. If the modeling power of the ANN were favored by using only two subsets (training and monitoring) instead of three subsets of smaller size (training, monitoring and validation), excellent results might be obtained on the monitoring set, but the model would not be truly validated, in the sense that the monitoring data were used to optimize one of the model parameters (the number of training iterations). For this reason, two errors are reported, for the monitoring and the test set, and both confirm the better prediction ability of the SPA-BP-ANN model. The predicted protein contents of the yogurt samples were then compared with the actual ones for each of the calibration samples. Finally, the two models were applied to the 15 test samples in order to predict their protein content. Table 4 shows the predicted results of both models.

As mentioned before, the application of ANN was investigated as a way to improve on the results obtained from classic PLS treatment of FTIR data. It was previously reported that PLS–ATR-FTIR analysis of yogurt samples is not very robust; the technique therefore requires further development. The results obtained from PLS show a REP of 7.00 for the calibration model, while it is 3.70 for the SPA-ANN model. A relationship between the spectral absorbance and the dependent variable would not benefit from a non-linear calibration, since PLS, which is a linear method, performed as well. It is concluded that feature selection improves the quality of a calibration. The results obtained from ANN show a REP of 7.25 for the calibration model, while it is 3.70 for the SPA-ANN model. It is also necessary to check that the training error is reasonably low at the number of retained iterations, and that the training and monitoring sets are representative of each other. Comparing the predicted results with the actual concentrations of the commercial samples on the basis of the F statistic shows that the SPA-ANN model offers the better capability for analysis in this case (box plot in Fig. 5).

Fig. 5. Box plot comparison of the results of both applied models based on F statistic, (A) ANN and (B) SPA-ANN model.

4. Conclusion

The ATR-FTIR technique makes it possible to investigate yogurt samples while reducing the effect of water interference. Cluster analysis is a helpful tool because of the heterogeneity of the samples. The results show that the SPA-ANN model gives acceptable results when processing the ATR-FTIR data. This new model can be regarded as a modification of the PLS–ATR-FTIR analysis of yogurt samples.

References

[1] CFSAN 2005 program priorities. Center for Food Safety & Applied Nutrition, US Food and Drug Administration (FDA), College Park, MD, 2005.
[2] M.M. Paradkar, J. Irudayaraj, Determination of cholesterol in dairy products using infrared techniques: 1. FTIR spectroscopy, Int. J. Dairy. Technol. 55 (2002) 127.
[3] I. Correia, A. Nunes, I.F. Duarte, A. Barros, I. Delgadillo, Sorghum fermentation followed by spectroscopic techniques, Food. Chem. 90 (2005) 853.
[4] D.M. Barbano, J.M. Lynch, J.R. Fleming, Direct and indirect determination of true protein content of milk by Kjeldahl analysis: collaborative study, J. Assoc. Off. Anal. Chem. 74 (1991) 281.
[5] O.H. Lowry, N.J. Rosebrough, A.L. Farr, R.J. Randall, Protein measurement with the folin phenol reagent, J. Biol. Chem. 193 (1951) 265.
[6] M.C. Garcia, M. Marina, M. Torre, Determination of the soybean protein content in soybean liquid milks by Reversed-Phase HPLC, J. Liq. Chromatogr. Relat. Technol. 23 (2000) 3165.
[7] I. Recio, C. Olieman, Determination of denatured serum proteins in the casein fraction of heat-treated milk by capillary zone electrophoresis, Electrophoresis 17 (1996) 1228.
[8] B.J. Reeves, D.R. Delwiche, Determination of protein in ground wheat samples by Mid-Infrared Diffuse Reflectance Spectroscopy, Appl. Spectrosc. 51 (1997) 1200.
[9] J. Moros, F.A. Inon, M. Khanmohammadi, S. Garrigues, M. de la Guardia, Evaluation of the application of attenuated total reflectance–Fourier transform infrared spectrometry (ATR-FTIR) and chemometrics to the determination of nutritional parameters of yogurt samples, Anal. Bioanal. Chem. 385 (2006) 708.
[10] W. Reichardt, W. Eckert, The determination of protein content of milk, cheese, and meat with the use of the biuret reaction, Nahrung 35 (1991) 731.
[11] Q.Q. Luthi-Peng, Z. Puhan, The 4th derivative UV spectroscopic method for the rapid determination of protein and casein in milk, Milchwissenschaft 54 (1999) 74.
[12] E. Albanell, G. Caja, X. Such, M. Rovai, A.A.K. Salama, R. Casals, Determination of fat, protein, casein, total solids, and somatic cell count in goat's milk by Near-Infrared Reflectance Spectroscopy, J. AOAC Int. 86 (2003) 746.
[13] S. Wold, N. Kettaneh-Wold, B. Skagerberg, Nonlinear PLS modeling, Chemom. Intell. Lab. Syst. 7 (1989) 53.
[14] S. Wold, Nonlinear partial least squares modelling II. Spline inner relation, Chemom. Intell. Lab. Syst. 14 (1992) 71.
[15] F. Despagne, D.L. Massart, Neural networks in multivariate calibration, Analyst 123 (1998) 157.
[16] R.M. Carvalho, C. Mello, L.T. Kubota, Simultaneous determination of phenol isomers in binary mixtures by differential pulse voltammetry using carbon fiber electrode and neural network with pruning as a multivariate calibration tool, Anal. Chim. Acta. 420 (2000) 109.
[17] Q. Li, X. Yao, X. Chen, M. Liu, R. Zhang, X. Zhang, Z. Hu, Application of artificial neural networks for the simultaneous determination of a mixture of fluorescent dyes by synchronous fluorescence, Analyst 125 (2000) 2049.
[18] J.A. Hageman, M. Streppel, R. Wehrens, L.M.C. Buydens, Wavelength selection with Tabu Search, J. Chemomet. 17 (2003) 427.
[19] H.C. Goicoechea, A.C. Oliviera, Wavelength selection by net analyte signals calculated with multivariate factor-based hybrid linear analysis (HLA). A theoretical and experimental comparison with partial least-squares (PLS), Analyst 124 (1999) 725.
[20] M. Cesar, T. Cristina, R. Kawakami, T. Yoneyama, The successive projections algorithm for variable selection in spectroscopic multicomponent analysis, Chemom. Intell. Lab. Syst. 57 (2001) 65.
[21] H. Ghasemi, A. Niazi, R. Leardi, Genetic-algorithm-based wavelength selection in multicomponent spectrophotometric determination by PLS: application on copper and zinc mixture, Talanta 59 (2003) 311.
[22] M. Khanmohammadi, M.A. Karimi, K. Ghasemi, M. Jabbari, A. Bagheri Garmarudi, Quantitative determination of Malathion in pesticide by modified attenuated total reflectance-Fourier transform infrared spectrometry applying genetic algorithm wavelength selection method, Talanta 72 (2007) 620.
[23] J. Moros, F.A. Iñón, S. Garrigues, M. de la Guardia, Determination of the energetic value of fruit and milk-based beverages through partial-least-squares attenuated total reflectance-Fourier transform infrared spectrometry, Anal. Chim. Acta. 538 (2005) 181.