Food Research International 93 (2017) 66–74
Contents lists available at ScienceDirect
Food Research International journal homepage: www.elsevier.com/locate/foodres
Bioinactivation: Software for modelling dynamic microbial inactivation Alberto Garre a, Pablo S. Fernández a,⁎, Roland Lindqvist b, Jose A. Egea c a Departamento de Ingeniería de Alimentos y del Equipamiento Agrícola, Instituto de Biotecnología Vegetal, Universidad Politécnica de Cartagena (ETSIA), Paseo Alfonso XIII, 48, 30203 Cartagena, Spain b National Food Agency, Box 622, SE 751 26 Uppsala, Sweden c Departamento de Matemática Aplicada y Estadística, Universidad Politécnica de Cartagena, Antiguo Hospital de Marina (ETSII), Av. Dr. Fleming S/N, 30202 Cartagena, Spain
a r t i c l e
i n f o
Article history: Received 24 November 2016 Received in revised form 12 January 2017 Accepted 15 January 2017 Available online 20 January 2017 Keywords: Predictive microbiology Non-isothermal microbial inactivation Model fitting Freeware tool
a b s t r a c t This contribution presents the bioinactivation software, which implements functions for the modelling of isothermal and non-isothermal microbial inactivation. This software offers features such as user-friendliness, modelling of dynamic conditions, possibility to choose the fitting algorithm and generation of prediction intervals. The software is offered in two different formats: Bioinactivation core and Bioinactivation SE. Bioinactivation core is a package for the R programming language, which includes features for the generation of predictions and for the fitting of models to inactivation experiments using non-linear regression or a Markov Chain Monte Carlo algorithm (MCMC). The calculations are based on inactivation models common in academia and industry (Bigelow, Peleg, Mafart and Geeraerd). Bioinactivation SE supplies a user-friendly interface to selected functions of Bioinactivation core, namely the model fitting of non-isothermal experiments and the generation of prediction intervals. The capabilities of bioinactivation are presented in this paper through a case study, modelling the non-isothermal inactivation of Bacillus sporothermodurans. This study has provided a full characterization of the response of the bacteria to dynamic temperature conditions, including confidence intervals for the model parameters and a prediction interval of the survivor curve. We conclude that the MCMC algorithm produces a better characterization of the biological uncertainty and variability than non-linear regression. The bioinactivation software can be relevant to the food and pharmaceutical industry, as well as to regulatory agencies, as part of a (quantitative) microbial risk assessment. © 2017 Published by Elsevier Ltd.
1. Introduction The microbial safety of food products is commonly ensured by the application of heat treatments which reduce the microbial hazards to safe levels. However, the application of high temperatures also reduces the sensorial properties of the product, lowering its quality. As shown in the work by Van Zuijlen et al. (2010), food industry usually overprocesses products causing an unnecessary reduction of quality. This is in contradiction with modern consumer demands for minimally processed products (fresh-like), which can only be produced through optimized heat treatments. For that purpose, a detailed description of the response of the microbial population to the heat treatments applied during processing is needed. Predictive microbiology serves a key role in this matter, because it aims to mathematically describe the behaviour of a microbial population under certain environmental conditions. Therefore, its results can be used for the description of the response of a microbial population to the heat treatments applied in industry. The prediction of the microbial reduction is based on mathematical models whose parameters ⁎ Corresponding author. E-mail address:
[email protected] (P.S. Fernández).
http://dx.doi.org/10.1016/j.foodres.2017.01.012 0963-9969/© 2017 Published by Elsevier Ltd.
must be estimated by fitting models to data from inactivation experiments and validated in the foods of interest (see, for instance, Dolan, 2003, Dolan, Valdramidis, & Mishra, 2013, or Cattani et al., 2016). However, the inherent variability within the microbial population impedes the description of its response using discrete curves (Aspridou & Koutsoumanis, 2015). Instead, characterization must be made using confidence intervals describing the probability distribution of the microbial count at a certain time. This kind of description enables the development of a quantitative microbial risk assessment (Haas, Rose, & Gerba, 2014). Classically, the limitations in both experimental equipment and calculation methods only allowed the characterization of the heat resistance of the microbial population in a two-step procedure using isothermal experiments (Stumbo, 1973). For historical reasons, mathematical models for the description of these experiments were categorized into primary, secondary and tertiary models (Whiting, 1995; Perez-Rodriguez & Valero, 2012). Primary models describe the time response of the microbial count under constant environmental conditions. The variations in the model parameters of the primary model with respect to environmental changes (temperature, pH, water activity, etc.) are described by the secondary models. These conditions are usually kept constant during each repetition of the experiments, keeping
A. Garre et al. / Food Research International 93 (2017) 66–74
primary and secondary models completely independent. Moreover, the constant conditions allow the description of primary and secondary models as algebraic equations. Finally, tertiary models are software which bundle primary and secondary models in a user friendly interface. In recent years, equipment has been developed which allows the performance of non-isothermal experiments, e.g. the thermoresistometer Mastia (Conesa, Andreu, Fernández, Esnoz, & Palop, 2009) or the heat exchanger by Huertas et al. (2016). They permit a more accurate reproduction of the conditions occurring in industrial treatments. However, the mathematical description of these conditions usually requires writing the inactivation model in differential form. Therefore, primary and secondary models cannot be fitted independently. Furthermore, the model parameters estimated using isothermal experiments are usually unable to predict the response of the microbial population under dynamic conditions. Analyses like the ones by Leontidis et al. (1999) and Janssen et al. (2008) have shown that non-isothermal inactivation of a microbial population can only be accurately predicted using models fitted to non-isothermal experiments. Muñoz-Cuevas, Metris, and Baranyi (2012) highlight the need for software tools that can integrate databases and microbial process models in Monte-Carlo simulations. Tenenhaus-Aziza and Ellouze (2015) reviewed many of the software packages currently available for predictive microbiology. Several tools are available which are able to fit primary models to data obtained from isothermal experiments. The algebraic solution of the isothermal inactivation model can be used to describe some non-isothermal inactivation profiles following the procedure used in e.g. Hassani, Condon, and Pagan (2007). Nevertheless, this approach has two major limitations: (i) it is only applicable to a handful of temperature profiles with sigmoidal inactivation curves; and (ii) the model parameters estimated are hard to extrapolate to other situations. For instance, the δ-values for the Mafart model (time required for the first log-reduction) estimated from the dynamic experiments are applicable to the temperature profile studied, but are hard to extrapolate to other profiles even if the microorganism, heating media and other environmental conditions are kept constant. Two well-known software tools for modelling microbial inactivation including dynamic conditions are Sym'Previus (www.symprevius.org) and Combase (http://www.combase.cc). Sym'Previus includes modules for model fitting of non-isothermal experiments and for the generation of dynamic predictions. Nevertheless, it is a commercial software, thus, users cannot access the code of the application. Combase is a freeware tool based on a broad database which includes values for the parameters of some popular inactivation models, including their uncertainty. Based on this data, it is possible to generate prediction intervals for isothermal experiments and dynamic inactivation curves. This paper presents the freeware bioinactivation, which implements different functions for the modelling of microbial inactivation. This tool is specially designed for the characterization of non-isothermal experiments including the uncertainty of the model parameters, and for the generation of prediction intervals of the inactivation curve. It is divided in two applications: bioinactivation core, a package for the R programming language (R Core Team, 2015), and bioinactivation SE (Simplified Environment), a user interface for selected features. Mathematical foundations of the problem addressed, dynamic models and capabilities of the software are presented through a case study, where the thermal resistance of B. sporothermodurans is determined under dynamic conditions. 2. Materials and methods 2.1. Mathematical framework The mathematical problem addressed here is the parameter estimation (or model calibration) of dynamic processes that can be modelled
67
as systems of differential-algebraic equations (DAEs) which has the following general form: (
A
dxðt; θÞ ¼ f ðt; xðt; θÞ; θ; uðt ÞÞ; dt xðt 0 ; θÞ ¼ x0 ðθÞ
t 0 btbt f
ð1Þ
where t is the time; the m-dimensional vector, θ, contains the m unknown model parameters, x is an n-dimensional vector of state variables (e.g., microbial count), u are the externally input signals (e.g., the time-varying temperature in an inactivation process), and f is a given vector function (e.g., the inactivation model). When components of the initial state vector x0 are also unknown, they can be included in θ (e.g., the initial microbial count in an inactivation process), thus x0 may depend on θ. In the simplest case, A is a constant diagonal n × n matrix with Aii = 1 or 0 if the i-th equation is a differential or algebraic equation respectively. In the models presented in Section 2.2, the primary models are differential while the secondary models are algebraic. Further equations can be added to the formulation above, like for example a vector of observables, which are the states variables (or a combination of some of them) that can be measured during the experiments. gðt; xðt; θÞ; θ; uðt ÞÞ
ð2Þ
Finally, a vector of constraint functions can be defined as the boundary values of model variables within specified intervals. cðt; xðt; θÞ; θ; uðt ÞÞ≥0
ð3Þ
For estimating the model parameters, let us assume that N measurements are available to find parameters of the system defined by Eqs. (1)–(3). Each measurement, yi, is specified by the time ti when the ith component of the observable vector g is measured. The correspond^ obtained solving the ing model value for a specific parameter vector θ, system (1) and computing the observable function (2), is denoted by g^i ^ θ; ^ uðt i ÞÞ. The vector of discrepancies between the model ¼ g ðt i ; xðt i ; θÞ; i
(theoretical) and the experimental values (commonly called residuals) is given by ^ ¼ g t; x t; θ ^ ; θ; ^ uðt Þ −y: e θ
ð4Þ
Given this description, the parameter estimation problem can be formulated as an m-dimensional optimization problem to minimize some measure, V(θ), for the discrepancy e(θ). The most used measure for the discrepancy is the generalized least-squares cost function N
min ∑ eðθÞT WeðθÞ θ
i¼1
ð5Þ
where W is a scaling matrix that balances the residuals. To find the minimum of the cost function, optimization methods are used. Different approaches can be used for this purpose. In Section 2.4 we will briefly describe two of them implemented in bioinactivation, namely non-linear regression and Monte Carlo Markov Chains. The solution method used for the initial value problem described in Eq. (1) and inherent to the optimization problem will also be mentioned. 2.2. Inactivation models Bioinactivation includes four microbial inactivation models commonly used in academia and industry. We will refer to them as Bigelow, Peleg, Mafart, and Geeraerd, as referenced in Table 1. The primary models used for Bigelow and Geeraerd model are log-linear (although Geeraerd model includes shoulder and tail effects), whereas the ones used for Peleg and Mafart models are based on the Weibull distribution.
68
A. Garre et al. / Food Research International 93 (2017) 66–74
Table 1 Summary of the inactivation models included in bioinactivation. Model name
Primary model
Secondary model
Number of parametersa
Number of variables
Bigelow Peleg Mafart Geeraerd
Log-linear inactivation Weibull-based (reparametrized) Mafart et al. (2002) Geeraerd et al. (2009), log-linear including shoulder and tails
Bigelow (1921) Peleg and Cole (1998) Bigelow like Bigelow (1921)
2 3 3 3
1 1 1 2
a
Reference temperature not considered as model parameter.
2.2.1. Log-linear inactivation models The log-linear primary inactivation model is shown in Eq. (6). According to this model, the microbial count, N, decreases exponentially with time, t. The time required for one logarithmic reduction under constant temperature, T, is defined by the D-value, D(T). dlog10 N 1 ¼− dt DðT Þ
ð6Þ
Bigelow (1921) concluded that the logarithm of the D-value varies linearly with temperature, as shown in Eq. (7). The sensitivity of the D-value to temperature changes is quantified through the z-value, z, which provides the temperature increment required for a ten-fold change of the D-value. Tref is a reference temperature defined by the user according to the target microorganisms for each type of foods (that can be linked to performance and process criteria). T−T ref log10 DðT Þ ¼ log10 D T ref − z
ð7Þ
The combination of Eqs. (6) and (7) results in Eq. (8). This model is labelled Bigelow model in bioinactivation. It has 3 model parameters: D(Tref), Tref and z, and 1 variable: N. Nevertheless, the reference temperature is usually fixed by the user, reducing the number of model parameters by one. dlog10 N ¼− dt D T
1 −ðT−T ref Þ=z 10 ref
ð8Þ
The model described by Geeraerd, Herremans, and Van Impe (2009) modifies Eq. (6) to account for shoulder and tail effects, as shown in Eq. (9). dN 1 Nres kmax ðT Þ N 1− ¼− N dt 1 þ Cc
ð9Þ
The description of the shoulder effect is based on the inclusion of a hypothetical substance Cc, which decays with rate kmax(T). The tail height is defined by the parameter Nres. The relationship between kmax and temperature is described using an equation similar to the one used in the Bigelow model (Eq. (10)). kmax ðT Þ ¼ kmax T ref 10−ðT−T ref Þ=z
ð10Þ
This model has 4 model parameters (kmax(Tref), Tref, z and Nres) and 2 variables (N and Cc). The reference temperature is usually set by the user, reducing the number of parameters to 3. The combination of Eqs. (9) and (10) constructs the Geeraerd model. Both Bioinactivation core and SE use the D(T) parameter instead of kmax(T) for the input, as well as for the output. This eases the comparison between results obtained using Geeraerd and Bigelow models. The relationship between both parameters is shown in Eq. (11). DðT Þ ¼
ln 10 kmax ðT Þ
ð11Þ
2.2.2. Weibullian inactivation models Weibullian inactivation models are based on the hypothesis that microbial inactivation can be described as a failure phenomenon. Therefore, the time required to inactivate an individual cell follows a Weibull distribution. This hypothesis, written in differential form as in Mafart, Courvert, Gaillard, and Leguerinel (2002), is described in Eq. (12), where S is the survival ratio (N/N0). δ(T) and p are, respectively, the scale and shape parameters of the Weibull distribution. δ(T) can be interpreted as the time required for the first log-reduction under constant temperature conditions. On the other hand, p includes the curvature of the survivor curve under isothermal conditions. dlog10 S 1 p p−1 t ¼ −p dt δðT Þ
ð12Þ
Mafart et al. (2002) used a secondary model for δ(T) similar to the one devised by Bigelow for the D-value. The resulting differential equation is provided in Eq. (13), which defines the model labelled Mafart in bioinactivation. In this model, the shape parameter, p, is considered temperature independent. dlog10 S 1 ¼ −p dt δ T ref 10−ðT−T ref Þ=z
!p t p−1
ð13Þ
The Mafart model has 4 model parameters (p, δ(Tref), z and Tref) and 1 variable (N). The reference temperature is usually known by the user, reducing the number of model parameters to estimate to 3. Peleg and Cole (1998) considered a different parameterization of the Weibull distribution, based on two different parameters, b(T) and n, as shown in Eq. (14). Nonetheless, this equation is equivalent to Eq. (12). n−1 dlog10 S −log10 S n ¼ −bðT Þ n dt bðT Þ
ð14Þ
However, they use a different secondary model to describe the relationship between b(T) and temperature. It is described by Eq. (15), where kb is a model parameter and Tc is a critical temperature below which no inactivation occurs. bðT Þ ¼ ln 1 þ ekb ðT−T c Þ
ð15Þ
The Peleg model, shown in Eq. (16), is the result of combining Eqs. (14) and (15). This model has 3 model parameters (k, Tc and n) and 1 variable (N). dlog10 S ¼ − ln 1 þ ekb ðT−T c Þ n dt
−log10 S ln 1 þ ekb ðT−T c Þ
!n−1 n ð16Þ
2.3. Software developed: bioinactivation core and bioinactivation SE The bioinactivation software offers advanced mathematical tools for the modelling of isothermal and non-isothermal microbial inactivation processes. It is presented in two different formats: an R package
A. Garre et al. / Food Research International 93 (2017) 66–74
including all the functions used for the mathematical modelling (bioinactivation core) and a user-friendly interface to selected features (bioinactivation SE). Bioinactivation core comprises the functions written in the R language for modelling microbial inactivation. The features implemented allow the prediction of the survivor curves for isothermal or non-isothermal inactivation processes. Moreover, the four inactivation models described in Section 2.2 are implemented. The software is able to perform the fitting using non-linear regression (nlr) or a Markov Chain Monte Carlo algorithm (MCMC). Based on the results of the fitting, the package is able to generate prediction intervals for isothermal or nonisothermal inactivation processes. The code in bioinactivation core has been bundled as an R package (Garre, Fernández, & Egea, 2015) which has been uploaded to the Comprehensive R Archive Network (CRAN), making it publicly available. Being part of a general purpose programming language, bioinactivation core can be easily be extended with new features or implemented in other tools (e.g. risk analysis or optimization). Bioinactivation SE is a so called tertiary model. It wraps primary and secondary models in a user-friendly interface developed using the shiny package of R (Chang, Cheng, Allaire, Wie, & McPherson, 2015). This package implements functions for building a web application able to run R code in the background. Therefore, users do not need to have installed R in their computer; the only requirements are an internet connection and a web browser. Bioinactivation SE is currently hosted in https://opada-upct.shinyapps.io/bioinactivation_SE. A permanent link to the webpage hosting the application can be found in the site of the Institute of Plant Biotechnology of the Technical University of Cartagena (http://www.upct.es/ibvupct/microbiologia.php), together with the last version of the user manual (the current version is supplied as Supplementary material to this publication) and various learning materials. For a thorough description of Bioinactivation core, we refer to the vignette of the bioinactivation package which can be accessed from its CRAN page (https://cran.r-project.org/web/packages/bioinactivation/ vignettes/inactivation.html). In the remaining of this section, the main features of Bioinactivation SE are briefly presented. For a detailed description of its interface, refer to its user manual. The layout of Bioinactivation SE is illustrated in Fig. 1. With the intention to guide the natural workflow of the model fitting, the two main actions have been divided by the navigation bar on top of the webpage: the data input and the model fitting. Each one is divided in a left panel for data input and a main panel for result output. Moreover, an about page with miscellaneous information is included. The experimental data is input to Bioinactivation SE as plain text files (UTF-8 encoding is recommended). The file uploaded must contain the elapsed time of the measurements, the temperature recorded and the microbial count measured. Once imported, Bioinactivation SE plots the data, allowing the user to check it. An example input file is included as part of the Supplementary material of this article. The solution of the system of algebraic and differential equations describing the inactivation models are solved numerically using the deSolve package for R (Soetaert, Petzold, & Setzer, 2010), which implements several methods for solving differential equations numerically. A brief description of the numerical algorithm is presented in Section 2.4.1. The model fitting can be performed using two different fitting algorithms: non-linear regression (nlr) and a Markov Chain Monte Carlo algorithm (MCMC). Both fitting algorithms have been imported from the FME package by Soetaert and Petzoldt (2010) (functions modFit and modMCMC). Sections 2.4.2 and 2.4.3 present a brief description of these algorithms. Bioinactivation SE allows the user to select which model parameters to fit to the experimental data and which ones are assigned to fixed values, including the initial values of the model variables. It produces three types of outputs. The first one is a plot comparing the experimental results and the fitted curve. This plot provides a
69
first impression of the goodness of the fit and may raise the need for modifying initial estimates for the model parameters, its bounds or the settings of the fitting algorithm. The second output is a set of tables containing the statistical summary of the model fitting. The first table includes the estimated value and standard deviation of the model parameters fitted. In addition, a confidence interval at the 95% level is provided for each one. The second table contains basic information regarding the residuals of the model fitting, namely the Sum Squared Errors (SSE), Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). The third summary table shows the correlation estimated between the fitted model parameters. The user manual of Bioinactivation SE (included in the Supplementary material) contains examples of these tables. Based on the estimates of the model parameters and their probability distributions, bioinactivation SE builds a prediction interval of the response of the microorganism to the heat treatment applied. Furthermore, Bioinactivation SE allows the user to download all the results generated as text files for further post-processing using external software. 2.4. Numerical methods 2.4.1. Generation of predictions: solution of the differential equations The generation of predictions using the inactivation models presented in Section 2.2 requires the solution of a system of algebraic and differential equations. The solution of this system can be written in closed form only for simple inactivation profiles. Therefore, the solution for industrial profiles, with complex forms and measurement errors, needs to be calculated using numerical methods. Bioinactivation core allows the user to choose the solver among the ones implemented in the deSolve package. On the other hand, Bioinactivation SE uses the LSODA solver (Hindmarsh, 1983) with the default options. In the tests performed, this algorithm has been able to successfully calculate a solution. 2.4.2. Model fitting: non-linear regression Bioinactivation includes a non-linear regression algorithm for the model fitting, namely the implementation of the Levenberg-Marquardt method presented in Moré (1978) and included in the FME package. The optimization scheme of this algorithm is a gradient descent method, which is prone to converge to a local minimum. It is advised to check the validity of the model fitted using the visualization options included in both versions of Bioinactivation. Both Bioinactivation core and Bioinactivation SE allow the definition of lower and upper bounds for the model parameters. Moreover, Bioinactivation core allows the user to choose other optimization algorithms as well as modifying their options The estimation of confidence intervals for the model parameters is based on the covariance matrix. It is estimated as cov = 1/(0.5 H), where H is the Hessian matrix. Confidence intervals (CI) are constructed assuming that the model parameters follow a t-distribution with n – p degrees of freedom, where n is the total number of observations and p the number of parameters fitted. Hence, the confidence interval for ^ θ is the esti^ θ, where σ model parameter θ is given by CI ¼ ^θ t 1−α;n−p σ 2
mated standard deviation of parameter θ. 2.4.3. Model fitting: MCMC fitting Bioinactivation wraps the FME package for R, which includes an implementation of the adaptive MCMC fitting algorithm described in Haario, Laine, Mira, and Saksman (2006). This fitting algorithm combines an adaptive Metropolis sampler and delayed rejection. The mathematical description of this algorithm is complex and out of the scope of this article. Here, we just focus on its practical implementation to the problem at hand. A more thorough description can be found in Haario et al. (2006).
70 A. Garre et al. / Food Research International 93 (2017) 66–74
Fig. 1. Overview of the model fitting section Bioinactivation SE.
A. Garre et al. / Food Research International 93 (2017) 66–74
The MCMC algorithm used is based on Bayesian statistics. As such, the user has to define prior-distributions of the model parameters to fit. In Bioinactivation SE, non-informative priors (i.e. uniform probability distributions) are used, limited by the bounds defined by the user. Bioinactivation core provides extra flexibility, allowing the user to define the prior-distributions. This algorithm does not include a stopping criteria based on the convergence of the fit. Instead, the user decides the number of iterations of the Markov chain. This kind of algorithm usually generates spurious results during its first iterations. Omitting these draws can speed up the convergence of the model calibration. The number of iterations omitted is called the burn-in length. This parameter can be set by the user in both versions of Bioinactivation. The iterations of the MCMC algorithm generate a sample of possible values of the model parameters. This set of values is an estimation of the probability distribution of the model parameters fitted (posterior distribution). In this sense, the MCMC algorithm is substantially different to non-linear regression, where it is assumed that the model parameters are normally distributed. Instead, MCMC estimates the probability distribution based on the data provided. This procedure is superior for describing the uncertainty of the model parameters, especially for cases where multimodality is present or the posterior distribution is not symmetrical. The covariance matrix is estimated as the covariance calculated between the model parameters obtained in the iterations of the MCMC algorithm. Confidence intervals are calculated from the quantiles of the posterior-distribution.
2.4.4. Generation of prediction intervals for the inactivation curve Bioinactivation is able to generate prediction intervals of the inactivation curve based on the estimation of the probability distribution of the model parameters fitted to the data. The prediction interval is calculated using the bootstrap method (Efron & Tibshirani, 1993), commonly used for the estimation of several statistics. This method can be summarized in the following steps: (1) a sample of feasible model parameters is generated based on the results of the model fit, (2) the survivor curve is simulated for each set of model parameters sampled and (3) the prediction band is calculated based on the quantiles of the simulations. The method used for the sampling of model parameters depends on the fitting algorithm used. For non-linear regression, it is assumed that the model parameters follow a multivariate normal distribution. For MCMC, the draws of the individual iterations of the algorithm are sampled with replacement.
71
The survivor curve is predicted using each set of model parameters sampled. This generates an estimated distribution of the microbial count at discrete values of time. Bioinactivation SE generates the prediction interval using 100 discrete time points and 100 simulations. Bioinactivation core allows the user to choose the value of both variables. The prediction interval is estimated as the quantiles of the distribution of the microbial count at each discrete time point. Both versions of bioinactivation allow the user choosing the confidence levels at which the prediction interval is calculated. 2.5. Case study The features implemented in bioinactivation SE are presented in this paper using experimental inactivation data obtained with the following procedure: known concentrations of Bacillus sporothermodurans IC4 spores were inoculated in sterile bidistilled water which had been previously heated to the target temperatures in a thermoresistometer Mastia (Conesa et al., 2009). They were then exposed to a non-isothermal temperature profile (Fig. 2). Samples were taken at predefined time intervals and the total count of microorganisms surviving the heat treatment was determined using standard microbiological procedures, such as dilutions and plate counts in nutrient agar. The Mastia thermoresistometer records the temperature profile in the vessel during the complete process. This procedure was followed to generate a set of non-isothermal experiments, consisting of two repetitions with ten observations each. The experimental results are included in bioinactivation core. The four models implemented in bioinactivation (Bigelow, Peleg, Mafart and Geeraerd) were fitted to the experimental data presented in Section 3 using both fitting algorithms implemented in the package (nlr and MCMC). Due to the fact that the data set studied does not contain shoulder or tail effects, Geeraerd and Bigelow models are equivalent. Consequently, its results are not shown here. Therefore, a total of six model fittings were performed. Realistic initial estimates and bounds for the model parameters were determined using preliminary tests. Table 2 provides the values set for the initial estimates and bounds of the model parameters for each calculation. The reference temperature was set to 120 °C on every case where it was present in the model (Bigelow, Mafart and Geeraerd). The initial microbial count was considered unknown and, thus, was fitted to the experimental data on every single calculation. The model fittings using the MCMC algorithm were performed several times varying its parameters to assess its convergence. The number of iterations required for convergence varied depending on the
Fig. 2. Non-isothermal inactivation of B. sporothermodurans on nutrient agar. The solid line illustrates the temperature profile used. The squares represent the microbial counts observed.
72
A. Garre et al. / Food Research International 93 (2017) 66–74
Table 2 Initial estimates and bounds of the parameters of each model considered for the fitting.
Bigelow
Mafart
Peleg
a
Parameter
Initial estimate
Lower bound
Upper bound
Dref (min) z (°C) Tref (°C) log10N0 (CFU/ml) δref (min) p z (°C) Tref (°C) log10N0 (CFU/ml) kb (1/°C) n Tc (°C) log10N0
10 8 120 6 10 1 10 120 6 0.25 0.9 130 6
1 4
20 12
a
a
5 1 0.5 8
7 20 2.5 15
a
a
5 0.01 0.8 110 5
7 1 1.1 150 7
Table 3 Statistical information of the fitting provided by the tool: estimates of the model parameters and their variances, as well as lower and upper limits of a confidence interval at the 95% confidence level. Moreover, the RMSE of the fitted curve is supplied. Algorithm Parameter Bigelow nlr
MCMC
Mafart
nlr
The model parameter is fixed (i.e. not fitted). MCMC
inactivation model. It was determined that a minimum of 4000 iterations was required with a burn-in length of 0 iterations.
3. Results and discussion Fig. 3 plots the curve fitted to the experimental data for each of combination of inactivation model and fitting algorithm. Visually, no significant differences among the survivor curves are observed. The similarities between the curves are confirmed by the obtained residual mean squared errors (RMSE, last column of Table 3), where the statistical information of the fit provided by the tool has been extracted. This table shows that similar values of RMSE have been obtained with every model. Table 3 also provides the values, standard deviations and confidence intervals of the model parameters estimated by the tool for each of the fittings performed. In every case, the values of the model parameters using either algorithm (nlr or MCMC) have deviations lower than 20%. Taking into account both the size and variance of the data set studied, such deviations are reasonable.
Peleg
nlr
MCMC
Dref (min) z (°C) N0 (CFU/ml) Dref (min) z (°C) N0 (CFU/ml) δref (min) p z (°C) N0 (CFU/ml) δref (min) p z (°C) N0 (CFU/ml) kb (1/°C) n Tc (°C) N0 (CFU/ml) kb (1/°C) n Tc (°C) N0 (CFU/ml)
Estimate
Confidence Interval
RMSE
5.65 ± 0.72 6.65 ± 0.93 598e5 ± 5.81e4
(4.12, 7.18) (4.70, 8.60) (4.75e5, 7.21e5)
0.09
5.64 ± 2.25 6.62 ± 2.50 1.61e5 ± 4.44e4 8.86 ± 6.41 1.30 ± 0.67 9.23 ± 6.50 6.02e5 ± 5.85e4 9.64 ± 4.86 1.40 ± 0.41 10.13 ± 2.01 6.09e5 ± 1.66e5 0.35 ± 0.07 1.06 ± 0.11 124.73 ± 1.19 6.01e5 ± 6.17e4 0.30 ± 0.10 2.07 ± 0.08 125.07 ± 4.69 6.59e5 ± 4.64e5
(4.22, 11.72) (3.49, 13.44) (1.46e5, 1.79e6)
0.10
(−4.72, 22.45) (−0.13, 2.72) (−4.55, 23.00) (4.78e5, 7.26e5)
0.09
(1.96, 18.52) (0.81, 2.28) (8.28, 14.86) (2.72e5, 5.69e6)
0.10
(0.20, 0.51) (0.83, 1.30) (122.21, 127.25) (4.70e5, 7.32e5)
0.09
(0.07, 0.42) (0.81, 1.09) (121.88, 141.35) (4.62e5, 2.12e6)
0.11
The standard deviation of the model parameters estimated using the nlr algorithm for Bigelow and Peleg models are lower than those estimated using the MCMC algorithm, resulting in a characterization of the microbial inactivation with lower uncertainty. On the other hand, the MCMC algorithm has provided a tighter description for Mafart
Fig. 3. Comparison between experimental results of non-isothermal inactivation of B. sporothermodurans (black squares) and fitted survivor curves. The left panel represents the curves calculated using the MCMC algorithm, while the right one illustrates those fitted using the nlr algorithm. Solid orange curves have been predicted using Bigelow model, dotted green curves using Peleg, short-dashed magenta using Mafart. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
A. Garre et al. / Food Research International 93 (2017) 66–74
model. However, from the magnitude of the differences and the amount of data considered, it is not possible to conclude that there were any significant differences between the algorithms in this particular case study. The nlr algorithm calculates negative values for the lower confidence interval of the parameters of the Mafart model. The inactivation models implemented in the package are defined only for positive values of the model parameters, thus, the negative lower limits predicted by the tool are spurious results. Confidence intervals for the model parameters are calculated under the assumption that they follow a t-distribution with the mean and standard deviations estimated by the fitting algorithm. Hence, the confidence intervals generated are always symmetrical. This assumption is obviously invalid for some of the cases studied. In these cases, the confidence intervals estimated using the MCCM algorithm, which does not make any assumption regarding the probability distribution of the model parameters, should be considered for the characterization. The issues of the non-linear regression algorithm for the estimation of the probability distributions of the model parameters are evident when trying to calculate prediction intervals of the survivor curve. The sample of model parameters generated under the normality assumption results in sets of unrealistic parameters which produce spurious simulations of the survivor curve. In consequence, the prediction interval calculated from the nlr adjustment looks ragged and, at some time points, it cannot be calculated. On the other hand, the MCMC algorithm calculates a better description of the probability of the model parameters, resulting in a better sample. Fig. 4 depicts the prediction intervals calculated based on each one of the 3 models fitted. Note that, this methodology can be used to generate predictions intervals for any temperature profile. Van Zuijlen et al. (2010) estimated z-values between 8.25 and 8.31 for this microorganism under similar conditions. This value is within the confidence interval shown in Table 3. They estimated a δ120 of 5.66, similar to the D-values provided in Table 3 but lower than the δvalues estimated in Table 3, although this value is within the confidence interval of the model parameter.
73
It is worth highlighting that the z-values estimated using Bigelow model are lower than those estimated using Mafart model. This is due to the fact that, while during an isothermal process the effect of each model parameter can be easily separated, during a dynamic one it cannot. Hence, the model parameters are highly correlated and changes in one of them implies variations on the rest. 4. Conclusions The accurate prediction of the inactivation steps during the food production chain is crucial for a quantitative risk analysis. Due to the variability and uncertainty of the microbial response, this can only be accomplished by following a rigorous (mathematical) modelling approach, i.e. by calculating prediction intervals. The generation of prediction intervals requires the use of advanced mathematical techniques which are able to estimate the variability and uncertainty of the microbial response through the probability distribution of model parameters. Bioinactivation includes functions for the modelling of non-isothermal microbial inactivation, namely the calibration of model parameters and the generation of predictions and prediction intervals based on the model fitted. These capabilities have been demonstrated through a case study where the non-isothermal inactivation of B. sporothermodurans has been characterized. The fitting has been made using both fitting algorithms available in the software, obtaining similar fitted curves. The estimated values of the model parameters are similar to those already published in the scientific literature. The tool is open-source and can be easily extended to include further environmental factors or to describe non-thermal inactivation. Bioinactivation core is written as a package for R, a general purpose programming language, and can be easily incorporated as part of other analyses. For instance, the results generated can be part of a quantitative microbiological risk assessment, since not only the fitted curve is obtained, but also a prediction interval is calculated. Although the case study presented in this paper is focused on food microbiology, the features implemented in bioinactivation can be useful for researchers in
Fig. 4. Prediction interval of the survivor curve for the temperature profile in Fig. 2 calculated based on the models fitted. The solid red line corresponds to the one calculated using Bigelow model, the green dashed line the one for Peleg model and the dotted blue line the one for Mafart model. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
74
A. Garre et al. / Food Research International 93 (2017) 66–74
other fields. For instance, pharmaceutical or biotechnology research can also benefit from the features implemented in this tool. Acknowledgements The financial support of this research work was provided by the Ministry of Economy and Competitiveness (MINECO) of the Spanish Government and European Regional Development Fund (ERDF) through projects AGL2013-48993-C2-1-R and DPI2011-28112-C04-04. Alberto Garre (BES-2014-070946) is grateful to the MINECO for awarding him a pre-doctoral grant. Appendix A. Supplementary data Supplementary data to this article can be found online at http://dx. doi.org/10.1016/j.foodres.2017.01.012. References Aspridou, Z., & Koutsoumanis, K. P. (2015). Individual cell heterogeneity as variability source in population dynamics of microbial inactivation special Issue on Predictive modelling in food. Food Microbiology, 45(Part B), 216–221. http://dx.doi.org/10.1016/j.fm.2014. 04.008. Bigelow, W. (1921). The logarithmic nature of thermal death time curves. Journal of Infectious Diseases, 29, 528–536. Cattani, F., Dolan, K. D., Oliveira, S. D., Mishra, D. K., Ferreira, C. A. S., Periago, P. M., ... Valdramidis, V. P. (2016). One-step global parameter estimation of kinetic inactivation parameters for Bacillus sporothermodurans spores under static and dynamic thermal processes. Food Research International, 89, 614–619. http://dx.doi.org/10.1016/j. foodres.2016.08.027. Chang, W., Cheng, J., Allaire, J., Wie, Y., & McPherson, J. (2015). R package (CRAN), shiny: Web application framework for R. R package version 0.11.1. Conesa, R., Andreu, S., Fernández, P. S., Esnoz, A., & Palop, A. (2009). Nonisothermal heat resistance determinations with the thermoresistometer Mastia. Journal of Applied Microbiology, 107(2), 506–513. http://dx.doi.org/10.1111/j.1365-2672.2009.04236.x. R Core Team (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Dolan, K. D. (2003). Estimation of kinetic parameters for nonisothermal food processes. Journal of Food Science, 68(3), 728–741. Dolan, K. D., Valdramidis, V. P., & Mishra, D. K. (2013). Parameter estimation for dynamic microbial inactivation: Which model, which precision? Food Control, 29, 401–408. Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. London: Chapman & Hall. Garre, A., Fernández, P. S., & Egea, J. A. (2015). R package (CRAN), bioinactivation: Simulation of dynamic microbial inactivation. R package version 1.1.0. Geeraerd, A. H., Herremans, C. H., & Van Impe, J. F. (2009). Structural model requirements to describe microbial inactivation during a mild heat treatment. International Journal of Food Microbiology, 59, 185–209. http://dx.doi.org/10.1016/S0168-1605(00)00362-7.
Haario, H., Laine, M., Mira, A., & Saksman, E. (2006). Dram: Efficient adaptive MCMC. Statistics and Computing, 16(4), 339–354. http://dx.doi.org/10.1007/s11222-0069438-0. Haas, C. N., Rose, J. B., & Gerba, C. P. (2014). Quantitative microbial risk assessment. New Yersey: John Willey & Sons. http://dx.doi.org/10.1002/9781118910030. Hassani, M., Condon, S., & Pagan, R. (2007). Predicting microbial heat inactivation under nonisothermal treatments. Journal of Food Protection, 70(6), 1457–1467. Hindmarsh, A. C. (1983). In R. W. Stepleman (Eds.), ODEPACK, a systematized collection of ODE solvers. 1983. (pp. 55–64). North-Holland, Amsterdam: Scientific Computing. Huertas, J. P., Aznar, A., Esnoz, A., Fernandez, P. S., Iguaz, A., Periago, P. M., & Palop, P. M. (2016). High heating rates affect greatly the inactivation rate of Escherichia coli. Frontiers in Microbiology, 7, 1256. http://dx.doi.org/10.3389/fmicb.2016.01256. Janssen, M., Verhulst, A., Valdramidis, V., Devlieghere, F., Van Impe, J. F., & Geeraerd, A. H. (2008). Inactivation model equations and their associated parameter values obtained under static acid stress conditions cannot be used directly for predicting inactivation under dynamic conditions. International Journal of Food Microbiology, 128(1), 136–145. Leontidis, S., Fernández, A., Rodrigo, C., Fernández, P. S., Magraner, L., & Martínez, A. (1999). Thermal inactivation kinetics of Bacillus stearothermophilus spores using a linear temperature program. Journal of Food Protection, 62(8), 958–961. Mafart, P., Courvert, O., Gaillard, S., & Leguerinel, I. (2002). On calculating sterility in thermal preservation methods: Application of the Weibull frequency distribution model. International Journal of Food Microbiology, 72(1–2), 107–113. http://dx.doi.org/10. 1016/S0168-1605(01)00624-9. Moré, J. J. (1978). The Levenberg-Marquardt algorithm: Implementation and theory. In G. A Watson (Ed.), "Numerical Analysis: Proceedings of the Biennial Conference Held at Dundee, June 28–July 1, 1977". (pp. 105–116). Berlin: Springer ISBN 978-3-54035972-2, http://dx.doi.org/10.1007/BFb0067700. Muñoz-Cuevas, M., Metris, A., & Baranyi, J. (2012). Predictive modelling of salmonella: From cell cycle measurements to e-models. Food Research International, 45(2), 852–862. http://dx.doi.org/10.1016/j.foodres.2011.04.033. Peleg, M., & Cole, M. B. (1998). Reinterpretation of microbial survival curves. Critical Reviews in Food Science, 38, 353–380. Perez-Rodriguez, F., & Valero, A. (2012). Predictive microbiology in foods. New York: Springer. Soetaert, K., & Petzoldt, T. (2010). Inverse modelling, sensitivity and Monte Carlo analysis in R using package FME. Journal of Statistical Software, 33(3), 1–28. Soetaert, K., Petzold, T., & Setzer, W. (2010). Solving differential equations in R: Package deSolve. Journal of Statistical Software, 33(9), 1–25. Stumbo, C. (1973). Thermobacteriology in food processing. New York: Academic Press. Tenenhaus-Aziza, F., & Ellouze, M. (2015). Software for predictive microbiology and risk assessment: A description and comparison of tools presented at the ICPMF8 Software Fair. Food Microbiology, 45, 290–299. http://dx.doi.org/10.1016/j.fm.2014.06.026. Van Zuijlen, A., Periago, P. M., Amezquita, A., Palop, A., Brul, S., & Fernández, P. S. (2010). Characterization of Bacillus sporothermodurans IC4 spores; putative indicator microorganism for optimisation of thermal processes in food sterilisation. Food Research International, 43(7), 1895–1901. http://dx.doi.org/10.1016/j.foodres.2009.11.011. Whiting, R. C. (1995). Microbial modelling in foods. Critical Reviews in Food Science and Nutrition, 35(6), 464–494.