Food Research International 131 (2020) 109040

Contents lists available at ScienceDirect

Food Research International journal homepage: www.elsevier.com/locate/foodres

Probabilistic model for estimating Listeria monocytogenes concentration in cooked meat products from presence/absence data

Wanxia Sun a,b, Tianmei Sun a, Xiang Wang a, Qing Liu a, Qingli Dong a,*

a School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
b School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200024, China

ARTICLE INFO

Keywords: Pathogens; Zero-inflated models; Over-dispersion; Prevalence; Qualitative data; Exposure assessment

ABSTRACT

A quantitative probabilistic model was developed to estimate the concentration of Listeria monocytogenes in cooked meat products based on presence/absence data and an assumed zero-inflated distribution, namely the zero-inflated Poisson (ZIP) or the zero-inflated Poisson lognormal (ZIPL) distribution. The performance of these two distributions was compared on two data sets (data sets A and B), which represented L. monocytogenes prevalence and concentrations in cooked meat products. In this study, L. monocytogenes contamination data consisted of 4.23% (8/189) and 4.17% (5/120) non-zero counts for data sets A and B, respectively. The contamination level of L. monocytogenes, determined by the most probable number (MPN) technique, ranged from 3 to 93 MPN/g among 13 positive samples. The goodness-of-fit test indicated that the ZIPL distribution fitted better than the simpler ZIP distribution when L. monocytogenes contamination levels on positive cooked meat samples showed large heterogeneity. Results obtained from the ZIPL distribution showed that the logarithmic mean value of L. monocytogenes positive samples was 1.5 log MPN/g (log σ = 0.4) for data sets A and B. This study provides an alternative probabilistic method for quantitative microbial risk assessment (QMRA) when only qualitative data are available, in particular when pathogen concentrations include large numbers of zero counts and show high variability.

* Corresponding author. E-mail address: [email protected] (Q. Dong).

https://doi.org/10.1016/j.foodres.2020.109040
Received 10 October 2019; Received in revised form 14 January 2020; Accepted 26 January 2020; Available online 27 January 2020
0963-9969/© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

Listeria monocytogenes is widely distributed in the environment and is able to grow over a wide range of temperatures (from 0 to 45 °C) (Mataragas, Zwietering, Skandamis, & Drosinos, 2010; Sauders & Wiedmann, 2007). Outbreaks and sporadic cases of listeriosis are predominantly associated with ready-to-eat (RTE) foods, as they are intended for consumption without any decontaminating treatment (European Food Safety Authority [EFSA], 2018; EFSA & European Centre for Disease Prevention and Control [ECDC], 2017; World Health Organization/Food and Agriculture Organization of the United Nations [WHO/FAO], 2004). A particularly important RTE category is cooked meat products: these foodstuffs undergo intense handling during the production process and are exposed to cross-contamination at retail or in the consumer's home, both of which increase the risk of listeriosis (Farber, Pagotto, & Scherf, 2007; Zhang et al., 2018). Quantitative microbial risk assessment (QMRA) is a valuable tool for understanding, managing and reducing foodborne pathogen risks in the area of food safety (EFSA, 2018; Tirloni et al., 2018).

Conducting a QMRA helps to estimate the probability and severity of a health disturbance due to ingestion of a particular pathogen conveyed by specific food products (Lindqvist & Westöö, 2000). In addition, QMRA is often performed to support reasonable decisions aimed at reducing public health risks (Membré & Boué, 2018). A complete QMRA is composed of four modules: (a) hazard identification, (b) hazard characterization, (c) exposure assessment, and (d) risk characterization. Implementation of quantitative exposure assessment depends on representative quantitative values to characterize the concentration of pathogens in food products (Jongenburger, Bassett, Jackson, Gorris et al., 2012; Petterson, Dumoutier, Loret, & Ashbolt, 2009; Pouillot, Hoelzer, Chen, & Dennis, 2013). In practice, analyses of pathogens are often based on presence/absence detection data as a result of the low concentration of the target micro-organism (Andritsos, Mataragas, Paramithiotis, & Drosinos, 2013; Williams & Ebel, 2012). Therefore, statistical inference based on qualitative results is needed to obtain quantifiable concentration estimates. The conventional and simple Poisson distribution, in which the mean equals the variance of the observed count data, has been widely used to describe pathogen concentrations in QMRA (Crépe, Albert, Dervin, & Carlin, 2007; Song, Pei, Xu, Yang, & Zhu, 2015). Based on the assumption of a Poisson distribution, Jarvis (2000) proposed an effective equation to evaluate the mean concentration of microorganisms in a batch of samples from presence/absence data. This equation has been widely applied in quantitative microbial exposure assessment for estimating the pathogen concentration in liquid food when only qualitative detection results are available (Ding, Yu, Schaffner, Chen, & Ye, 2016). The equation has also been employed to evaluate the concentration of pathogens below the detection limit in solid food such as raw meat and cooked meat products (Mataragas et al., 2010; Valero et al., 2014). However, L. monocytogenes contamination levels in cooked meat products are usually highly variable (Angelidis & Koutsoumanis, 2006; Ross, Rasmussen, Fazil, Paoli, & Sumner, 2009), which contradicts the assumption of the Poisson model that microbes are randomly distributed. Therefore, the variability of the Poisson distribution is insufficient to describe the heterogeneity of L. monocytogenes concentrations (Williams & Ebel, 2012).

When only qualitative data are available, quantifying L. monocytogenes concentrations based on appropriate statistical distributions has been identified as a considerable challenge in undertaking QMRA. The variability of L. monocytogenes concentrations arises from (1) the possibility that a fraction of cooked meat is categorically uncontaminated, and (2) the degree of contamination varying among contaminated batches of products. These practical phenomena lead to corresponding statistical features in empirical data, including (1) the occurrence of many zeros (more than expected from a Poisson distribution), and (2) over-dispersion of positive count data, with variance greater than expected under the Poisson model (Mian & Paul, 2016).
Apart from the variability, there is also uncertainty in the concentration, which may be a further complicating factor (Zwietering, 2015). Uncertainty is defined as an imperfect state of knowledge about a parameter or about the model used to reflect the situation (Gay & Korre, 2009). One of the most obvious sources of uncertainty in a quantitative exposure assessment is related to sampling and measurement errors (Duarte, Stockmarr, & Nauta, 2015; Gay & Korre, 2009). For example, in L. monocytogenes presence/absence or enumeration tests, the detection limit introduces uncertainty into the overall characterization of the sample's contamination. Furthermore, inaccuracy of the statistical model (or its parameters) used to describe the variability of L. monocytogenes concentrations could increase the uncertainty of concentration estimates. Therefore, when attempting to accurately characterise the variability of L. monocytogenes concentrations, the statistical approach should be able to account for over-dispersion in the data.

In recent years, statistical distributions that adequately model L. monocytogenes concentrations have often been zero-inflated distributions (Pouillot et al., 2013). Ridout, Demetrio, and Hinde (1998) indicated that the zero-inflated Poisson (ZIP) distribution outperforms the zero-inflated negative binomial (ZINB) distribution for modelling over-dispersed data. Gonzales-Barron, Kerr, Sheridan, and Butler (2010) demonstrated that the ZIP distribution could be more appropriate than the Poisson or negative binomial distribution when modelling low microbial counts in foods. Furthermore, a logarithmic transformation could be suitable for highly variable positive counts, which leads to the zero-inflated Poisson lognormal (ZIPL) distribution. To our knowledge, however, hierarchical models and their zero-inflated counterparts have not been developed for concentration estimates in cooked meat products based on qualitative data.
The objectives of the present study are (i) to evaluate the concentration of L. monocytogenes in cooked meat products based on the available qualitative data, and (ii) to construct and compare two zero-inflated models (ZIP and ZIPL distributions) for transforming qualitative data into quantitative estimates.

2. Materials and methods

2.1. Sample collection

A total of 309 samples of cooked meat products were collected in Shanghai municipality, P. R. China. The sample sizes were 189 in 2017 (data set A) and 120 in 2018 (data set B). Samples were purchased from local retailers and catering restaurants and were selected by a randomized sampling method. To account for seasonal variation in L. monocytogenes growth characteristics, 12 samplings were conducted between July 2017 and August 2018, once a month except January and February. Each sampling comprised 30 samples in 2017 and 20 samples in 2018, with the exception of September 2017, when 39 samples were collected. The cooked meat samples included 284 meat with sauce samples, 7 fried meat samples, 2 sausage samples and 16 cooked dried meat samples. Each sample weighed 250 g and was placed in a separate commercial bag. After collection, samples were transported directly at 4 °C to a standard food microbiology laboratory and tested within 2 h. For data set A, quantitative data on L. monocytogenes concentrations have been published for exposure assessment in a previous study (Sun et al., 2019).

2.2. Qualitative and quantitative analysis of L. monocytogenes

Cooked meat products were aseptically removed from the commercial bags. A 25 g sample unit was taken at random from different parts of each product. The presence/absence test and the enumeration of L. monocytogenes were conducted according to the China national food safety standard GB 4789.30–2016 (National Health and Family Planning Commission of China [NHFPC], 2016). For qualitative analysis, the sample unit was homogenized and incubated in 225 mL of Listeria Enrichment Broth I (LB1) for 24 ± 2 h. Subsequently, 0.1 mL of the LB1 enrichment culture was added to Listeria Enrichment Broth II (LB2) for a second 24 ± 2 h enrichment.
The resulting limit of detection (LOD) of the presence/absence test corresponds to a count of 1 CFU per 25 g, i.e. 0.04 CFU/g. For quantitative analysis, the nine-tube most probable number (MPN) technique was used to quantify the concentration of L. monocytogenes in a sample by means of replicate liquid broth growth in tenfold dilutions (NHFPC, 2016). The nine tubes were divided into three sets of three tubes each. Briefly, 1 mL samples of serial dilutions from 10^-1 to 10^-3, with three replicates each, were added to 10 mL of LB1 and incubated at 30 ± 1 °C for 24 ± 2 h. Subsequently, a 0.1 mL subsample from each tube was transferred to a new tube containing 10 mL of LB2 and incubated at 30 ± 1 °C for an additional 24 ± 2 h. Then 0.333 mL of inoculum was surface-plated onto Listeria Chromogenic Agar and PALCAM and incubated at 37 °C for 24–48 h for enumeration. The MPN value was determined from the number of positive and negative tubes in each of the three sets and the standard MPN table. The theoretical limit of quantitation (LOQ) equals 1 CFU in a single replicate plate of any single dilution, which corresponds to 1 CFU/0.333 mL = 3 MPN/mL.

2.3. Quantitative derivation of L. monocytogenes concentrations by qualitative data

A set of data on L. monocytogenes contamination levels typically contains a high percentage of zero counts. To account for the variability of L. monocytogenes concentrations in cooked meat products, a mixture of two distributions is used: a degenerate distribution for the portion of non-detects (< LOD) and a standard Poisson distribution. This type of model is called the ZIP distribution (Chebon, Faes, Cools, & Geys, 2017). Usually, the ZIP distribution is used to describe a batch of products in which one group is free of contamination and the other group is contaminated in a random pattern (Jongenburger, Bassett, Jackson, Zwietering et al., 2012). This model is now commonly used when the variance of viable counts is higher than the mean because of a high proportion of non-detections (Nohra, Grinberg, Marshall, Midwinter, Collins-Emerson, & French, 2019). In the ZIP model, excess zeros can be categorized into structural zeros (or true zeros) and random zeros (or false zeros): zeros arising from the complete absence of contamination are modelled as structural zeros, while random zeros result from samples contaminated below the detection limit (Park et al., 2015). The probability mass function of the ZIP distribution is given by:

$$
P(y_i) =
\begin{cases}
p_0 + (1 - p_0)\exp(-\lambda_i v), & y_i < \mathrm{LOD} \\[4pt]
(1 - p_0)\,\dfrac{(\lambda_i v)^{y_i}\exp(-\lambda_i v)}{y_i!}, & y_i \ge \mathrm{LOD}
\end{cases}
\tag{1}
$$

where yi represents the colony count at observation i (i = 1, 2, …, n observations); p0 is the probability of a zero count resulting from an uncontaminated sample unit; λi is the contamination level in the batch of cooked meat from which the i-th sample was drawn; and v is the quantity of sample analyzed. The quantity of the screening sample used in this study is v = 25 g. It is assumed that samples are independently distributed in a batch of meat products and that the detection of L. monocytogenes in sample i is a binomial random variable. Consider, for example, a batch of meat samples comprising n independent individuals. If none of the n individuals is contaminated with L. monocytogenes, then the batch as a whole is not contaminated. However, if at least one of the n individuals is contaminated, then the batch is considered to be contaminated. The probability that m samples are negative and n − m samples are positive in the batch follows a binomial distribution. Under the assumption of the ZIP model, the probability of a non-detect is p0 + (1 − p0)exp(−λv), so the probability of observing m non-detects among n samples is given by Bin(m; n; p0 + (1 − p0)exp(−λv)). The probability mass function of the model is as follows:

$$
p = \binom{n}{m}\,\bigl[p_0 + (1 - p_0)\exp(-\lambda v)\bigr]^{m}\,\bigl[(1 - p_0)\bigl(1 - \exp(-\lambda v)\bigr)\bigr]^{n-m}
\tag{2}
$$

where n is the total number of samples; m is the number of non-detects; and λ is the concentration of L. monocytogenes in the sample. The maximum likelihood estimation (MLE) method was used to estimate the parameter values that are most likely to generate the observed measurements (Busschaert, Geeraerd, Uyttendaele, & Van Impe, 2010). From Eq. (2), the MLE for λ can be obtained by:

$$
\frac{\partial p}{\partial \lambda} = \binom{n}{m}\Bigl[\, v(p_0 - 1)\exp(-\lambda v)\, m\,\bigl[p_0 + (1 - p_0)\exp(-\lambda v)\bigr]^{m-1}\bigl[(1 - p_0)\bigl(1 - \exp(-\lambda v)\bigr)\bigr]^{n-m}
+ (n - m)\, v(1 - p_0)\exp(-\lambda v)\,\bigl[p_0 + (1 - p_0)\exp(-\lambda v)\bigr]^{m}\bigl[(1 - p_0)\bigl(1 - \exp(-\lambda v)\bigr)\bigr]^{n-m-1} \Bigr] = 0
\tag{3}
$$

therefore,

$$
\lambda = -\frac{1}{v}\,\ln\!\left(\frac{m - n p_0}{n - n p_0}\right)
\tag{4}
$$

In order to model the over-dispersion of the positive contamination concentration, Ln(λ) ~ Normal(μ, σ) can be further assumed.

2.4. Bayesian approach to derive L. monocytogenes concentration

In this study, the parameters of the statistical distributions for L. monocytogenes concentrations were estimated by Bayesian inference. Rather than offering a point estimate, Bayesian analysis generates posterior results in the form of probability distributions, which provide a quantitative description of the uncertainty in the estimated model parameters. A prior distribution of the model parameters is necessary for Bayesian analysis to infer the parameters in the form of posterior distributions. To obtain the posterior estimates of the model parameters (p0, λ, μ, σ), it is critical to characterize prior distributions for p0 and either λ in the ZIP distribution or the hyper-parameters (μ, σ) in the ZIPL distribution. Let p0 ~ uniform(0, 0.2) and λ ~ uniform(12, 25) denote the prior distributions for the parameters in the ZIP model (Gkogka, Reij, Gorris, & Zwietering, 2013; Gómez et al., 2015; Paudyal et al., 2018; Prencipe et al., 2012). Additionally, Ln μ ~ uniform(2.5, 3.2) is used as the hyper-prior distribution for the mean parameter in the ZIPL model. In the absence of sufficient prior knowledge about an individual parameter, relatively uninformative (vague) prior distributions are usually specified to represent the lack of previous knowledge (El-Basyouny & Sayed, 2009). An uninformative prior distribution, Ln σ ~ uniform(0.01, 0.1), is assigned to the standard deviation parameter. Posterior probability distributions are sampled using the Markov chain Monte Carlo (MCMC) technique in the statistical software program R version 3.5.1 (http://www.R-project.org).

2.5. Model comparisons and application

The parameters of the ZIP and ZIPL distributions are estimated from qualitative or quantitative data. To evaluate the accuracy of the qualitative data transformation method, a statistical significance test is used to compare the predicted values of the two models with the observed data. The Kolmogorov-Smirnov (K-S) test is a non-parametric goodness-of-fit test for the cumulative distribution of data samples (Zhang, Wang, Liang, & Liu, 2010). It compares the empirical distribution function (EDF) with a theoretical cumulative distribution function to determine whether both random variables are drawn from an identical distribution or come from different distributions (Hassani & Silva, 2015). In this study, the K-S test for goodness-of-fit is used to compare the fitted ZIP and ZIPL distribution functions with the EDF. The model with the highest probability value is considered the best one to predict a repeated data set with the same structure as the observed data. The EDF is calculated from the data by sorting the n collected enumerations in increasing order and assigning order numbers i from 1 to n (Reinders, de Jonge, & Evers, 2003). The cumulative probability (i/n) can then be computed from the ordered data.

3. Results and discussion

3.1. Prevalence and concentrations of L. monocytogenes in cooked meat products

L. monocytogenes positive samples, classified by sampling period and by detection method, are shown in Table 1. The prevalence of L. monocytogenes in cooked meat samples is 4.23% (8/189) in 2017 and 4.17% (5/120) in 2018. The data sets include excessive non-detects consisting of uncontaminated samples and false negative samples. Because the qualitative and quantitative test methods have different detection limits, the prevalence of L. monocytogenes in data set A differs between the two detection methods. Meanwhile, data set B shows the same prevalence of L. monocytogenes with both detection methods, which means that no L. monocytogenes contamination levels in data set B fall between the LOD and the LOQ. The qualitative detection results in both data sets correspond to the prevalence reported in other studies on L. monocytogenes in cooked meat products in China (Yang et al., 2018). Additionally, the prevalence of L. monocytogenes found in this study is similar to that reported for cooked meat products at retail in south China (5.04%) (Chen, Wu, Zhang, Yan, & Wang, 2014). Such a discrepancy in the prevalence of L. monocytogenes may result from different sampling times, sampling regions, and uncertainty during detection.

The MPN values of L. monocytogenes on positive samples are shown in Table 2. For data set A, L. monocytogenes contamination levels in positive cooked meat samples are 9.2 and 75 MPN/g. The observed concentration of L. monocytogenes ranges from 3 to 93 MPN/g for data set B. The mean concentration of L. monocytogenes positive samples is 42.1 MPN/g (σ = 46.5) and 22.8 MPN/g (σ = 39.3) for data set A and data set B, respectively. Therefore, L. monocytogenes contamination levels show over-dispersion in data sets A and B. Previous studies also showed considerable variability in L. monocytogenes counts even within a single batch (Carpentier & Cerf, 2011; Jongenburger, Bassett, Jackson, Zwietering et al., 2012); thus quantitative estimates based on qualitative data should allow for large heterogeneity.

Table 1
Number of L. monocytogenes positive samples in data sets A and B.

Data set   Sampling date   Number   L. monocytogenes positive (%)
                                    Qualitative detection   Quantitative detection
A          2017            189      8 (4.23)                2 (1.06)
B          2018            120      5 (4.17)                5 (4.17)

Table 2
Numbers of cooked meat samples positive for L. monocytogenes.

Data set   Retail/catering   Concentration (MPN/g)
A          Catering          9.2
A          Catering          75
B          Retail            3.6
B          Retail            93
B          Retail            7.4
B          Retail            7.2
B          Retail            3.0

Table 3
Parameter estimates of the zero-inflated Poisson (ZIP) and zero-inflated Poisson lognormal (ZIPL) distributions fitted to the qualitative and quantitative data.

Data set     Data type           ZIP distribution         ZIPL distribution
                                 p0 (%)   λ̄ (MPN/g)      p0 (%)   μ̄ (log MPN/g)   σ̄
Data set A   Qualitative data    95.77    16.1            95.77    1.5              0.4
Data set A   Quantitative data   98.94    42.1            98.94    1.4              0.5
Data set B   Qualitative data    95.83    16.8            95.83    1.5              0.4
Data set B   Quantitative data   95.83    22.8            95.83    1.0              0.5
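As a quick consistency check on the quantitative rows of Table 3, the following Python sketch recomputes the share of non-detects and the mean and standard deviation of the positive MPN values from Tables 1 and 2. It rests on two assumptions that are inferred from the numbers rather than stated in the text: that the p0 column for quantitative data coincides with the observed fraction of non-detects, and that σ is the sample (n − 1) standard deviation. The names `counts`, `mpn` and `summarise` are illustrative only.

```python
from statistics import mean, stdev

# (total samples, positives by quantitative detection), taken from Table 1
counts = {"A": (189, 2), "B": (120, 5)}

# Positive MPN/g values, taken from Table 2
mpn = {"A": [9.2, 75.0], "B": [3.6, 93.0, 7.4, 7.2, 3.0]}


def summarise(label):
    """Return (% non-detects, mean MPN/g, sample SD) for one data set."""
    n_total, n_positive = counts[label]
    share_non_detect = 100.0 * (n_total - n_positive) / n_total
    values = mpn[label]
    # stdev() uses the n - 1 denominator (assumed to match the reported sigma)
    return share_non_detect, mean(values), stdev(values)


for label in ("A", "B"):
    share, avg, sd = summarise(label)
    print(f"data set {label}: non-detects {share:.2f}%, "
          f"mean {avg:.1f} MPN/g, SD {sd:.1f}")
```

Under these assumptions the output agrees with the values reported for quantitative data in Section 3.1 and Table 3 (98.94% and 95.83% non-detects; 42.1 and 22.8 MPN/g; σ = 46.5 and 39.3).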

3.2. Estimation of L. monocytogenes concentrations from qualitative or quantitative data

Assuming that the microbial concentration in contaminated products followed a lognormal distribution, it was common in the 1990s and early 2000s to calculate a log mean concentration based only on enumeration results (Abadias, Cañamás, Asensio, Anguera, & Viñas, 2006). Ignoring non-detects or replacing censored values by the LOD may result in bias for both the μ and σ estimators (Lorimer & Kiermeier, 2007; Shorten, Pleasants, & Soboleva, 2008). In particular, if prevalence is less than 25%, the lognormal distribution gives poor fitting results (Commeau, Parent, Delignette-Muller, & Cornu, 2012). In the present study, zero counts in L. monocytogenes detection results should be considered; otherwise, a biased estimate could be obtained. Additionally, the variability and uncertainty of L. monocytogenes concentrations can be separated by a Bayesian approach (Pouillot & Delignette-Muller, 2010), which helps to obtain a more accurate estimate.

By combining the probabilistic approach proposed in this study with the Bayesian approach, the parameter estimates of the ZIP and ZIPL distributions are calculated from the qualitative data (Table 3). The posterior mean value is used as the point estimate for the concentration of L. monocytogenes. For data set A, the concentration of L. monocytogenes predicted by the ZIP model has a mean value of 16.1 MPN/g and 42.1 MPN/g according to qualitative and quantitative data, respectively. In the ZIP distribution, the estimated parameter λ is considerably shifted to the left by presence/absence data, thus tending to produce a lower concentration value. As modelled by the ZIPL distribution, the estimated logarithmic mean value for positive samples is 1.5 log MPN/g (log σ = 0.4) and 1.4 log MPN/g (log σ = 0.5) for qualitative and quantitative data, respectively. Additionally, for data set B, the ZIP and ZIPL distributions are also fitted to calculate the parameters (λ, μ, σ) by using the proposed method. This procedure demonstrates that the logarithmic mean value of the ZIPL distribution derived from qualitative data is higher than the one estimated from quantitative data. Although the ZIP distribution makes it easier to calculate L. monocytogenes contamination levels from presence/absence results, it could underestimate the concentration level when only qualitative data are available.

The interval estimate is shown by the posterior probability distributions of λ and μ in Fig. 1. Posterior distributions are the predictive probability densities of the model parameters given the observed data set. The posterior predictive distribution in this paper is the contamination level of L. monocytogenes, based on the model parameters inferred from a qualitative data set. In the ZIP and ZIPL distributions, hyper-parameters are used to describe the uncertainty of L. monocytogenes contamination levels. The posterior distribution accounts for the updated uncertainty in the model hyper-parameters. In the Bayesian framework, hyper-prior distributions provide the basis for inferring posterior distributions. The shape of the posterior distribution depends on the prior information, the observed data set and the likelihood function. An informative prior may have only a weak influence on the posterior distribution if the weight of evidence provided by newly collected data is strong (Schmidt, 2010). In contrast, a relatively uninformative prior contributes minimal subjective information to the posterior distribution, which means that the posterior distribution is determined mainly by the available data (Schmidt, 2010). When informative priors and expert opinions on parameters are lacking, convenient uninformative prior distributions are valuable for all unknown model parameters in the likelihood function.

A simple and widely used model is included in this study for comparison. Under the assumption that the contamination level follows a Poisson distribution, the concentration of a pathogen can be assessed through the following equation (Jarvis, 2000):

$$
\lambda = -\frac{1}{v}\,\ln\!\left(\frac{S_{neg}}{S_{total}}\right)
$$

where S_neg is the number of samples that tested negative and S_total is the total number of samples analyzed. For data sets A and B, the prevalence of L. monocytogenes is less than 5%, thus the parameter representing the mean concentration (λ) is less than −2.7 log MPN/g. Fig. 2 shows the cumulative mass function for L. monocytogenes concentrations when λ equals −2.7 log MPN/g. The percentage of zero counts is 99.80% according to the estimated outcome. This theoretical result shows that positive samples (> LOQ) could not be predicted by this method. The significant bias in parameter estimates derived from the Jarvis equation makes it difficult to recommend its use when microbial cells in food are not randomly distributed but heterogeneously aggregated or clustered. The only practical advantage of the Jarvis equation is that no prior distribution needs to be specified or defended, so it is very easy to use. Despite the apparent complexity of the mathematical models introduced in this study, the qualitative data transformation method presented in this work could be easily extended to any pathogen by simply changing the prior information.
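The Jarvis (2000) figures quoted in this section can be reproduced with a few lines of Python. The sketch below is only an illustration: the function name `jarvis_mean_concentration` is hypothetical, the 5% prevalence case uses 95 negatives out of 100 samples as a stand-in, and the zero-count percentage is computed for a 1 g unit, which is an assumption about how the 99.80% figure was obtained.

```python
import math


def jarvis_mean_concentration(n_negative, n_total, sample_mass_g):
    """Poisson-based estimate: lambda = -(1/v) * ln(S_neg / S_total)."""
    if not 0 < n_negative <= n_total:
        raise ValueError("need 0 < n_negative <= n_total")
    return -math.log(n_negative / n_total) / sample_mass_g


# 5% prevalence with a 25 g analytical unit (illustrative counts)
lam_5pct = jarvis_mean_concentration(95, 100, 25.0)
log10_lam_5pct = math.log10(lam_5pct)   # close to -2.7 log MPN/g

# Data set A, qualitative test: 181 negatives out of 189 samples (Table 1)
lam_a = jarvis_mean_concentration(189 - 8, 189, 25.0)

# Poisson probability of a zero count in 1 g when lambda = 10**-2.7 per gram
# (the 1 g unit is an assumption made for this sketch)
p_zero = math.exp(-(10 ** -2.7))

print(f"log10 lambda at 5% prevalence: {log10_lam_5pct:.2f}")
print(f"log10 lambda for data set A:   {math.log10(lam_a):.2f}")
print(f"zero counts at -2.7 log MPN/g: {100 * p_zero:.2f}%")
```

Because the prevalence in data set A is below 5%, its estimate falls slightly below the 5% case, in line with the statement that λ is less than −2.7 log MPN/g.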

Fig. 1. Posterior densities of parameters λ in the zero-inflated Poisson distribution (A) and μ in the zero-inflated Poisson lognormal (B).

Fig. 2. The cumulative mass function for L. monocytogenes concentrations according to the Jarvis equation, when the prevalence equals 5%.

3.3. Comparison of the cumulative probability of ZIP and ZIPL distributions

Fig. 3 shows the cumulative probability distribution of L. monocytogenes concentrations in data sets A and B, which allows an overall comparison between the ZIP and ZIPL models for qualitative and quantitative data. When the L. monocytogenes prevalence differs between the two detection methods, the ZIP and ZIPL distributions tend to overestimate the probability of zero counts according to the quantitative data. However, relatively low contamination levels of L. monocytogenes on products at the retail phase are highly significant if they have a chance to proliferate to hazardous numbers during periods of temperature abuse (Campagnollo, Gonzales-Barron, Cadavez, Sant'ana, & Schaffner, 2018). Therefore, the available qualitative data used for estimating concentrations will provide useful information for QMRA.

Fig. 3. Cumulative mass distributions of L. monocytogenes on cooked meat products as modelled by the zero-inflated Poisson (ZIP) and zero-inflated Poisson lognormal (ZIPL) models for qualitative and quantitative data in two data sets.

The box plot presents the concentration of L. monocytogenes on positive samples according to the ZIP and ZIPL distributions (Fig. 4). When L. monocytogenes contamination levels are estimated by the ZIP distribution with qualitative data, the mean value of L. monocytogenes concentrations is found to be lower than that in the observed data. Furthermore, according to qualitative data, the median value of L. monocytogenes concentrations modelled by the ZIPL distribution is higher than the one predicted by the ZIP model, while the opposite holds for quantitative data. To ensure consumers' safety, it is appropriate to use the ZIPL distribution to estimate L. monocytogenes concentrations based on available presence/absence data in QMRA.

By comparing the goodness-of-fit of the statistical distributions via the K-S test, we are able to obtain a richer understanding of the underlying characteristics of the L. monocytogenes distribution, which in turn enables a more efficient and accurate estimation. The K-S test shows that the ZIP and ZIPL distributions are not significantly different from the EDF (Table 4). This suggests that, for qualitative and quantitative data, both distributions appear to represent the variability in L. monocytogenes concentrations on cooked meat products.

The zero-inflated Bayesian hierarchical models are robust methods to estimate the concentration of L. monocytogenes. Earlier, Masago et al. (2006) compared the effectiveness of the Poisson distribution with the Poisson lognormal distribution in evaluating the health risk caused by noroviruses in drinking water based on the available qualitative data in Japan. That study showed that the Poisson lognormal method gives better concentration estimates of microorganisms than the Poisson method. Similarly, Petterson et al. (2009) developed a hierarchical Bayesian framework for describing variability in pathogen concentrations from presence/absence observations for E. coli O157:H7. Their result indicated that a single distribution would be inadequate to describe the concentration in the surface water over the entire year. In this work, the result of the K-S analysis favours the ZIPL distribution by an apparently small difference in probabilities. This proves statistically that L. monocytogenes concentrations in positive samples are not distributed randomly. Regardless of mathematical complexity, the ZIPL distribution is more appropriate in the presence of substantial zero counts and bacterial clustering. The assumption that organisms are randomly distributed within a single well-mixed sample was considered reasonable; however, it is possible that organisms are clumped or dispersed within a batch of samples (Mussida, Gonzales-Barron, & Butler, 2013; Petterson et al.,


ZIP distribution from qualitative data can be straightforward, while logarithmic transformation to induce data normality can be suitable for bacterial counts of high occurrence in positive samples (Gonzales-Barron, Cadavez, & Butler, 2014). Here, it is worth mentioning that estimates of prevalence and concentration distributions used in exposure assessment can have substantial impacts on quantitative risk assessment outcomes (Pouillot et al., 2013). In particular, risk assessment results provide an increasingly essential scientific basis for risk managers making risk management decisions, including policy, legislation and standards (Wu, Liu, & Chen, 2018). The proposed method has been applied to L. monocytogenes data sets, and it can easily be extended to other target micro-organisms in a similar way. With appropriate informative prior distributions or subjective expert opinions, the model aims to be applicable to data that include many zero counts and considerable heterogeneity. The limitation associated with such methods, however, is that the assumed subjective priors may not be appropriate when there is a lack of objective prior distributions (Beaudequin et al., 2015). Further, the appropriate statistical distribution for estimating pathogen concentrations from presence/absence data should be explored for other kinds of foods in the future.

4. Conclusion

This study provided a novel data transformation method for estimating L. monocytogenes concentrations in cooked meat products based on qualitative detection data. This method was applied to quantitative monitoring results of L. monocytogenes in two data sets. Additionally, this method based on zero-inflated models could be a powerful tool in QMRA when quantitative data are scarce. The Bayesian approach demonstrated great flexibility in coping with complex calculations and data-scarce problems, and facilitated the incorporation of prior information. 
Users of this method should carefully consider the appropriateness of prior probability distributions to avoid biased parameter estimates.
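
To illustrate why the choice of prior deserves care, the following minimal sketch compares the posterior mean prevalence under several Beta priors, using the 8/189 and 5/120 positive counts reported for data sets A and B. The conjugate Beta-binomial model, the function name `beta_posterior_mean`, and the specific priors are illustrative assumptions; this is not the zero-inflated hierarchical model fitted in the study.

```python
def beta_posterior_mean(positives, n, a, b):
    """Posterior mean of prevalence for binomial data under a Beta(a, b) prior.

    The posterior is Beta(a + positives, b + n - positives), whose mean is
    (a + positives) / (a + b + n).
    """
    return (a + positives) / (a + b + n)


# Assumed priors, chosen only to show how much the answer can move.
priors = {
    "uniform Beta(1, 1)": (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "sceptical Beta(1, 50)": (1.0, 50.0),
}

# Positive/total counts as reported for data sets A and B.
data_sets = {"A": (8, 189), "B": (5, 120)}

for name, (positives, n) in data_sets.items():
    for label, (a, b) in priors.items():
        mean = beta_posterior_mean(positives, n, a, b)
        print(f"data set {name}, {label}: posterior mean prevalence = {mean:.4f}")
```

With so few positive samples, a strongly informative prior such as the assumed Beta(1, 50) pulls the estimate noticeably away from the observed proportion, and the effect is larger for the smaller data set.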

Fig. 4. Box plot graphic representation of L. monocytogenes concentrations on cooked meat positive samples estimated by the zero-inflated Poisson (ZIP) and zero-inflated Poisson lognormal (ZIPL) models based on qualitative (Qual) and quantitative (Quan) data in two data sets. The plot provides the minimum (bottom ‘-’ markers), 1st percentile (lower ‘×’ markers), 25th percentile (lower box limits), mean (‘•’ markers), 50th percentile (center lines), 75th percentile (upper box limits), 99th percentile (upper ‘×’ markers) and maximum (top ‘-’ markers).

CRediT authorship contribution statement

Wanxia Sun: Methodology, Software, Formal analysis, Data curation, Writing - original draft, Visualization. Tianmei Sun: Validation, Investigation. Xiang Wang: Writing - review & editing. Qing Liu: Resources. Qingli Dong: Conceptualization, Resources, Writing - review & editing, Supervision, Project administration.

Table 4. Kolmogorov-Smirnov (K-S) analysis for the goodness-of-fit of zero-inflated Poisson (ZIP) and zero-inflated Poisson lognormal (ZIPL) distributions.

Distribution                 K-S analysis, Dataset A    K-S analysis, Dataset B
ZIP (qualitative data)       0.9711                     0.7624
ZIP (quantitative data)      0.9084                     0.7375
ZIPL (qualitative data)      0.9998                     0.9994
ZIPL (quantitative data)     0.9998                     0.9997
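
As a rough illustration of how a K-S comparison against the empirical distribution function works for zero-inflated counts, the sketch below computes the K-S distance between an empirical CDF and a ZIP CDF. The counts, the parameter values, and the helper names `zip_cdf` and `ks_distance` are hypothetical, and the sketch returns only the distance, not the probabilities reported in Table 4.

```python
import math


def zip_cdf(k, pi0, lam):
    """CDF of a zero-inflated Poisson: structural zeros with probability pi0,
    otherwise Poisson(lam) counts."""
    if k < 0:
        return 0.0
    poisson_cdf = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1)
    )
    return pi0 + (1.0 - pi0) * poisson_cdf


def ks_distance(counts, cdf):
    """Largest absolute gap between the empirical CDF of integer counts and a
    model CDF. Both are step functions on the integers, so checking every
    integer from 0 to the largest observed count is sufficient."""
    n = len(counts)
    gap = 0.0
    for k in range(max(counts) + 1):
        ecdf = sum(c <= k for c in counts) / n
        gap = max(gap, abs(ecdf - cdf(k)))
    return gap


# Hypothetical data: 181 non-detects and 8 positive counts (not the study's data).
counts = [0] * 181 + [3, 4, 4, 9, 15, 23, 43, 93]

# Assumed Poisson means for the contaminated fraction; pi0 set to the zero share.
for lam in (5.0, 25.0):
    d = ks_distance(counts, lambda k, lam=lam: zip_cdf(k, 181 / 189, lam))
    print(f"lambda = {lam}: K-S distance = {d:.4f}")
```

Because a single Poisson mean cannot follow positive counts spread over more than an order of magnitude, some gap remains whatever value of `lam` is chosen, which is the kind of heterogeneity a lognormal mixing component is meant to absorb.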

Acknowledgments

This work has been financially supported by the National Natural Science Foundation of China (NSFC 31801455), and the China National Key Research and Development Program (2018YFC1602500).

2009). Given that the concentration estimates of microorganisms are highly dependent upon the assumed model, the statistical model describing the spatial distribution of microorganisms should be carefully selected. Jongenburger, Bassett, Jackson, Zwietering et al. (2012) compared the performance of frequently used distributions, namely the Normal distribution, various types of the Poisson distribution, the lognormal distribution, the Gamma distribution, the negative binomial distribution, and the Poisson lognormal (PL) distribution, for modeling spatial distributions of microorganisms. They concluded that the PL distribution is likely to be a suitable distribution to reflect clustering in microbial distributions. In some circumstances, detection results may contain a large proportion of zeros. The PL distribution greatly overestimates the mean microbial concentration when it is fitted to data with a high frequency of zero counts (Gonzales-Barron & Butler, 2011). As a consequence, zero-inflated distributions should be preferred for low numbers of microorganisms. Zero-inflated models have often been used for the statistical treatment of zero counts (Wang & Hailemariam, 2018). In the presence of zero counts in data sets, estimation of the parameter in the

Declaration of Competing Interest

The authors declared that there is no conflict of interest.

References

Abadias, M., Cañamás, T. P., Asensio, A., Anguera, M., & Viñas, I. (2006). Microbial quality of commercial “Golden Delicious” apples throughout production and shelf-life in Lleida (Catalonia, Spain). International Journal of Food Microbiology, 108, 404–409.
Andritsos, N. D., Mataragas, M., Paramithiotis, S., & Drosinos, E. H. (2013). Quantifying Listeria monocytogenes prevalence and concentration in minced pork meat and estimating performance of three culture media from presence/absence microbiological testing using a deterministic and stochastic approach. Food Microbiology, 36, 395–405.
Angelidis, A. S., & Koutsoumanis, K. (2006). Prevalence and concentration of Listeria monocytogenes in sliced ready-to-eat meat products in the Hellenic retail market. Journal of Food Protection, 64(4), 938–942.
Beaudequin, D., Harden, F., Roiko, A., Stratton, H., Lemckert, C., & Mengersen, K. (2015). Beyond QMRA: Modelling microbial health risk as a complex system using Bayesian networks. Environment International, 80, 8–18.
Busschaert, P., Geeraerd, A. H., Uyttendaele, M., & Van Impe, J. F. (2010). Estimating distributions out of qualitative and (semi)quantitative microbiological contamination data for use in risk assessment. International Journal of Food Microbiology, 138, 260–269.
Campagnollo, F. B., Gonzales-Barron, U., Cadavez, V. A. P., Sant’ana, A. S., & Schaffner, D. W. (2018). Quantitative risk assessment of Listeria monocytogenes in traditional Minas cheeses: The cases of artisanal semi-hard and fresh soft cheeses. Food Control, 92, 370–379.
Carpentier, B., & Cerf, O. (2011). Review - Persistence of Listeria monocytogenes in food industry equipment and premises. International Journal of Food Microbiology, 145, 1–8.
Chebon, S., Faes, C., Cools, F., & Geys, H. (2017). Models for zero-inflated, correlated count data with extra heterogeneity: When is it too complex? Statistics in Medicine, 36, 345–361.
Chen, M., Wu, Q., Zhang, J., Yan, Z., & Wang, J. (2014). Prevalence and characterization of Listeria monocytogenes isolated from retail-level ready-to-eat foods in South China. Food Control, 38, 1–7.
Commeau, N., Parent, E., Delignette-Muller, M., & Cornu, M. (2012). Fitting a lognormal distribution to enumeration and absence/presence data. International Journal of Food Microbiology, 155, 146–152.
Crépe, A., Albert, I., Dervin, C., & Carlin, F. (2007). Estimation of microbial contamination of food from prevalence and concentration data: Application to Listeria monocytogenes in fresh vegetables. Applied & Environmental Microbiology, 73, 250–258.
Ding, T., Yu, Y. Y., Schaffner, D. W., Chen, S. G., & Ye, Z. Q. (2016). Farm to consumption risk assessment for Staphylococcus aureus and staphylococcal enterotoxins in fluid milk in China. Food Control, 59, 636–643.
Duarte, A. S. R., Stockmarr, A., & Nauta, M. J. (2015). Impact of microbial count distributions on human health risk estimates. International Journal of Food Microbiology, 196, 40–50.
EFSA & ECDC (2017). The European Union summary report on trends and sources of zoonoses, zoonotic agents and food-borne outbreaks in 2016. EFSA Journal, 15, 5077.
EFSA (2018). Listeria monocytogenes contamination of ready-to-eat foods and the risk for human health in the EU. EFSA Journal, 16, 5134.
El-Basyouny, K., & Sayed, T. (2009). Collision prediction models using multivariate Poisson-lognormal regression. Accident Analysis & Prevention, 41, 820–828.
Farber, J., Pagotto, F., & Scherf, C. (2007). Incidence and behavior of Listeria monocytogenes in meat products. In E. T. Ryser, & E. H. Marth (Eds.). Listeria, Listeriosis, and Food Safety (pp. 503–570). New York: Marcel Dekker.
Gay, J. R., & Korre, A. (2009). Accounting for pH heterogeneity and variability in modelling human health risks from cadmium in contaminated land. Science of the Total Environment, 407, 4231–4237.
Gkogka, E., Reij, M. W., Gorris, L. G. M., & Zwietering, M. H. (2013). The application of the Appropriate Level of Protection (ALOP) and Food Safety Objective (FSO) concepts in food safety management, using Listeria monocytogenes in deli meats as a case study. Food Control, 29, 382–393.
Gómez, D., Iguácel, L. P., Rota, M. C., Carramiñana, J. J., Ariño, A., & Yangüela, J. (2015). Occurrence of Listeria monocytogenes in ready-to-eat meat products and meat processing plants in Spain. Foods, 4, 271–282.
Gonzales-Barron, U., & Butler, F. (2011). A comparison between the discrete Poisson-Gamma and Poisson-Lognormal distributions to characterise microbial counts in foods. Food Control, 22, 1279–1286.
Gonzales-Barron, U., Cadavez, V., & Butler, F. (2014). Conducting inferential statistics for low microbial counts in foods using the Poisson-gamma regression. Food Control, 37, 385–394.
Gonzales-Barron, U., Kerr, M., Sheridan, J. J., & Butler, F. (2010). Count data distributions and their zero-modified equivalents as a framework for modelling microbial data with a relatively high occurrence of zero counts. International Journal of Food Microbiology, 136, 268–277.
Hassani, H., & Silva, E. S. (2015). A Kolmogorov-Smirnov based test for comparing the predictive accuracy of two sets of forecasts. Econometrics, 3, 590–609.
Jarvis, B. (2000). Sampling for microbiological analysis. In B. M. Lund, T. C. Baird-Parker, & G. W. Gould (Eds.). The microbiological safety and quality of food (pp. 1727–1728). Maryland: Aspen Publishers.
Jongenburger, I., Bassett, J., Jackson, T., Gorris, L. G. M., Jewell, K., & Zwietering, M. H. (2012). Impact of microbial distributions on food safety II. Quantifying impacts on public health and sampling. Food Control, 26, 546–554.
Jongenburger, I., Bassett, J., Jackson, T., Zwietering, M. H., & Jewell, K. (2012). Impact of microbial distributions on food safety I. Factors influencing microbial distributions and modelling aspects. Food Control, 26, 601–609.
Lindqvist, R., & Westöö, A. (2000). Quantitative risk assessment for Listeria monocytogenes in smoked or gravad salmon/rainbow trout in Sweden. International Journal of Food Microbiology, 58, 181–196.
Lorimer, M., & Kiermeier, A. (2007). Analysing microbiological data: Tobit or not Tobit? International Journal of Food Microbiology, 116, 313–318.
Masago, Y., Katayama, H., Watanabe, T., Haramoto, E., Hashimoto, A., Omura, T., et al. (2006). Quantitative risk assessment of Noroviruses in drinking water based on qualitative data in Japan. Environmental Science & Technology, 40, 7428–7433.
Mataragas, M., Zwietering, M. H., Skandamis, P. N., & Drosinos, E. H. (2010). Quantitative microbiological risk assessment as a tool to obtain useful information for risk managers - Specific application to Listeria monocytogenes and ready-to-eat meat products. International Journal of Food Microbiology, 141, 5170–5179.
Membré, J., & Boué, G. (2018). Quantitative microbiological risk assessment in food industry: Theory and practical application. Food Research International, 106, 1132–1139.
Mian, R., & Paul, S. (2016). Estimation for zero-inflated over-dispersed count data model with missing response. Statistics in Medicine, 35, 5603–5624.
Mussida, A., Gonzales-Barron, U., & Butler, F. (2013). Effectiveness of sampling plans by attributes based on mixture distributions characterising microbial clustering in food. Food Control, 34, 50–60.
NHFPC (2016). National food safety standard food microbiological examination: Listeria monocytogenes. http://bz.cfsa.net.cn/staticPages/BC60573F-7E60- 4E3F-AC6C19E2484D73CD.html.
Nohra, A., Grinberg, A., Marshall, J. C., Midwinter, A. C., Collins-Emerson, J. M., & French, N. P. (2019). Shifts in the molecular epidemiology of Campylobacter jejuni infections in a sentinel region of New Zealand, following the implementation of food safety interventions by the poultry industry. Applied and Environmental Microbiology. https://doi.org/10.1128/AEM.01753-19.
Park, S., Navratil, S., Gregory, A., Bauer, A., Srinath, I., Szonyi, B., et al. (2015). Multifactorial effects of ambient temperature, precipitation, farm management, and environmental factors determine the level of generic Escherichia coli contamination on preharvested spinach. Applied and Environmental Microbiology, 81(7), 2635–2650.
Paudyal, N., Pan, H., Liao, X., Zhang, X., Li, X., Fang, W., et al. (2018). A meta-analysis of major foodborne pathogens in Chinese food commodities between 2006 and 2016. Foodborne Pathogens and Disease, 15(4), 187–197.
Petterson, S. R., Dumoutier, N., Loret, J. F., & Ashbolt, N. J. (2009). Quantitative Bayesian predictions of source water concentration for QMRA from presence/absence data for E. coli O157:H7. Water Science and Technology, 59, 2245–2252.
Pouillot, R., & Delignette-Muller, M. L. (2010). Evaluating variability and uncertainty separately in microbial quantitative risk assessment using two R packages. International Journal of Food Microbiology, 142, 330–340.
Pouillot, R., Hoelzer, K., Chen, Y., & Dennis, S. (2013). Estimating probability distributions of bacterial concentrations in food based on data generated using the most probable number (MPN) method for use in risk assessment. Food Control, 29, 350–357.
Prencipe, V. A., Rizzi, V., Acciari, V., Iannetti, L., Giovannini, A., Serraino, A., ... Migliorati, G. (2012). Listeria monocytogenes prevalence, contamination levels and strains characterization throughout the Parma ham processing chain. Food Control, 25, 150–158.
Reinders, R. D., de Jonge, R., & Evers, E. G. (2003). A statistical method to determine whether micro-organisms are randomly distributed in a food matrix, applied to coliforms and Escherichia coli O157 in minced beef. Food Microbiology, 20, 297–303.
Ridout, M., Demetrio, C. G. B., & Hinde, J. (1998). Models for count data with many zeros. Proceedings of the XIX international biometric conference, Cape Town.
Ross, T., Rasmussen, S., Fazil, A., Paoli, G., & Sumner, J. (2009). Quantitative risk assessment of Listeria monocytogenes in ready-to-eat meats in Australia. International Journal of Food Microbiology, 131, 128–137.
Sauders, B. D., & Wiedmann, M. (2007). Ecology of Listeria species and L. monocytogenes in the natural environment. In E. T. Ryser, & E. H. Marth (Eds.). Listeria, listeriosis, and food safety (pp. 21–53). New York: Marcel Dekker.
Schmidt, P. J. (2010). Addressing the uncertainty due to random measurement errors in quantitative analysis of microorganism and discrete particle enumeration data. https://uwspace.uwaterloo.ca/handle/10012/5596.
Shorten, P., Pleasants, A., & Soboleva, T. (2008). Estimation of microbiological growth using population measurements subject to a detection limit. International Journal of Food Microbiology, 108, 369–375.
Song, X., Pei, X., Xu, H., Yang, D., & Zhu, J. (2015). Risk ranking of Listeria monocytogenes contaminated ready-to-eat foods at retail for sensitive population in China. Chinese Journal of Food Hygiene, 27, 447–450.
Sun, W., Jin, Y., Dai, Y., Xiao, J., Wang, X., & Dong, Q. (2019). Application of zero-inflated models in quantitative exposure assessment of Listeria monocytogenes in bulk cooked meat. Food Science, 40(11), 49–54.
Tirloni, E., Stella, S., de Knegt, L. V., Gandolfi, G., Bernardi, C., & Nauta, M. J. (2018). A quantitative microbial risk assessment model for Listeria monocytogenes in RTE sandwiches. Microbial Risk Analysis, 9, 11–21.
Valero, A., Hernandez, M., Cesare, A. D., Manfreda, G., García-Gimeno, R. M., González-García, P., et al. (2014). Probabilistic approach for determining Salmonella spp. and L. monocytogenes concentration in pork meat from presence/absence microbiological data. International Journal of Food Microbiology, 184, 60–63.
Wang, F., & Hailemariam, S. S. (2018). Sampling plans for the zero-inflated Poisson distribution in the food industry. Food Control, 85, 359–368.
WHO/FAO (2004). Risk assessment of Listeria monocytogenes in ready to eat foods technical report. Microbiological Risk Assessment Series, No. 5.
Williams, M. S., & Ebel, E. D. (2012). Methods for fitting a parametric probability distribution to most probable number data. International Journal of Food Microbiology, 157, 251–258.
Wu, Y. N., Liu, P., & Chen, J. S. (2018). Food safety risk assessment in China: Past, present and future. Food Control, 90, 212–221.
Yang, S., Pei, S., Yang, D., Zhang, H., Chen, Q., Chui, H., et al. (2018). Microbial contamination in bulk ready-to-eat meat products of China in 2016. Food Control, 91, 113–122.
Zhang, G., Wang, X., Liang, Y., & Liu, J. (2010). Fast and robust spectrum sensing via Kolmogorov-Smirnov test. IEEE Transactions on Communications, 58(12), 3410–3416.
Zhang, W., Wang, X., Xu, C., Chen, Y., Sun, W., Liu, Q., et al. (2018). Modeling inhibition effects of Lactobacillus plantarum subsp. plantarum CICC 6257 on growth of Listeria monocytogenes in ground pork stored at CO2-rich atmospheres. LWT-Food Science and Technology, 97, 811–817.
Zwietering, M. H. (2015). Risk assessment and risk management for safe foods: Assessment needs inclusion of variability and uncertainty, management needs discrete decisions. International Journal of Food Microbiology, 213, 118–123.
