Evaluation of parameter estimation methods for ozone disinfection kinetics

Evaluation of parameter estimation methods for ozone disinfection kinetics

Pergamon 0043-1354(94)00153-7 Wat. Res. Vol. 29, No. 2, pp. 679-686, 1995 Copyright © 1995 ElsevierScienceLid Printed in Great Britain. All rights r...

510KB Sizes 0 Downloads 61 Views

Pergamon

0043-1354(94)00153-7

Wat. Res. Vol. 29, No. 2, pp. 679-686, 1995 Copyright © 1995 ElsevierScienceLid Printed in Great Britain. All rights reserved 0043-1354/95 $7.00 + 0.00

EVALUATION OF PARAMETER ESTIMATION METHODS FOR OZONE DISINFECTION KINETICS HONGDE ZHOU and DANIELW. SMITH*~ Department of Civil Engineering, University of Alberta, Edmonton, Alberta, Canada T6G 2G7 (First received June 1993; accepted in revised form May 1994) Abstract--Inactivation data with ozone in a CSTR system were analyzed using error-in-variables method (EVM), forced linearization and classical nonlinear regression. Their relative efficiency was compared statistically. The Monte Carlo simulations demonstrated that the EVM produces parameter estimates with the least bias and minimum variance of residuals from the regression. Although used traditionally, the forced linearization would give the biased parameter estimates of kinetics and lead to the potential of misinterpreting the type of inactivation. Requiring the ad hoc assumption of first order decay rate, the classical nonlinear regression is of little use in practice. Fitting the kinetic model to experimental data shows that the behavior of the bacteria and virus inactivation can be well described by predictions from EVM estimates. The resultant coefficient of dilution (n), defined as equation (1) was found to be much higher than one, indicating that the ozone concentration is a dominant factor over the contact time in determining the efficiency of ozone disinfection processes. Key words--disinfection, kinetics, Monte Carlo simulation, ozone, parameter estimation, statistical analysis, water treatment

NOMENCLATURE /3 = C= i= k I, k2 = N= n= S= = S*= r = t=

vector of parameters to be estimated concentration of dissolved ozone (mg/l) index of observations empirical rate constants (mg/1)-n s-i density of surviving microorganisms (CFU/I) empirical coefficient of dilution microbial survival ratio (the relative density of viable organisms at time t to the initial density of organisms) estimate of microorganism density (CFU/I) "true" density of microorganisms (CFU/1) mean hydraulic detention time (s) reaction time in a batch reactor (s) 1. INTRODUCTION

Development of disinfection kinetics for the predictions of inactivation efficiencies and for the design of ozone contactors is traditionally based on observations obtained from batch experiments. It is a c o m m o n practice in the analysis of such observations either to: (1) assume the concentration of disinfectant constant, or (2) consider it to be timedependent with a known dissipation rate. The sequential linear regression, nonlinear least squares regression, or the maximum likelihood method is then employed to evaluate the suitability of different kinetic models and produce the "best" kinetic parameter estimates (Haas, 1988; Venczel et al., 1992).

Ozone is extremely reactive in water. The rapid decay via both autodecomposition and oxidation, and the powerful biocidal potential in water make it difficult to define mathematically the ozone residual declining rate and collect the initial inactivation data with batch experimentation. To overcome these problems, the inactivation with ozone in a C S T R system was recommended (Zhou and Smith, 1994). It offers two special advantages: (1) maintains the ozone residual and the temperature constant; (2) minimizes the interference with the inactivation from sampling. The expressions deduced for the description of microorganism survival in the C S T R system become complicated. They are often implicit nonlinear algebraic equations. To estimate the kinetic parameters characterizing these equations, the commonly used method is inapplicable. This paper presents a new error-in-variables minimization approach to analyze the inactivation data obtained from a C S T R system. The appropriateness of this approach, as compared to the forced linearization and classical nonlinear regression, was examined statistically. Both the data generated from the Monte Carlo simulation and the experimental observations were used. The results show that the new approach is preferred in terms of producing the less biased estimates and the minimum variance of residuals from regression.

2. THEORY The derivation of kinetic expressions for the inactivation in C S T R systems was presented else-

*To whom all correspondence should be addressed. 679

680

HONGDE ZHOU and DANIELW. SMITH

where (Zhou and Smith, 1994). In a batch reactor, a generalized kinetics was written as: dS

k I C "S

dt

1 + k 2C " t

(i)

Rather, all the variables are classified into either those subject to the measurement error or others free of the measurement error. In this regard, equation (3) can be written as f (log S, C, z, fl )

where S is the survival ratio; C the concentration of ozone residual; n the coefficient of dilution and; k~ and k 2 empirical rate constants. The equation is capable of describing the different types of inactivation behaviors and the dominating role of ozone residual over contact time in the inactivation process. Under steady-state conditions, the C in a CSTR is constant. Equation (1) can be mathematically transformed into a form of the power law

= log S + log (I + kl C " ~ S k2/k,) = 0

where fl is the vector of parameters to be estimated. Noting that with errors present, the experimental observations do not exactly satisfy equation (5). To minimize the interference from these experimental errors on the estimation of parameters, the objective function using the log S as the least squares variable is to be minimized Min Q = M i n ~

dS

- d-~ = kl C " S ° +k2/k,).

(5)

!

(log S, - log 5~,)

(6)

(2) subject to the constraint

Based on the mass balance with respect to the number of viable organisms around the reactor boundary, the degree of inactivation was correlated as: (3)

1 - S = k I C " z S ~l +k2/k,)

where r is the mean hydraulic detention time. To obtain the estimates of the kinetic parameters, three approaches can be adopted: forced linearization, error-in-variables minimization and classical nonlinear regression. M e t h o d 1 (Forced linearization)

This method was proposed by Roy et al. (1981) for the analysis of virus inactivation with ozone. The analysis proceeds by linearizing equation (3) through the logarithm transformation of both sides. After rearrangement, it can be written as

Iog--r

=l°gk~+nl°gC+

1+~

logS. (4)

At a fixed concentration, a log-log plot of (1 - S ) / r vs S will give a straight line. The slope of line is equal to (1 + k 2 / k 1); and the intercept to the sum of log kl and n log C. This overall procedure is repeated at each concentration. The resultant intercepts plotted vs the log of concentrations will again yield another straight line with a slope of n and an intercept of log k~. This forced linearization is straightforward. However, it is not consistent with the real error structures of variables, as a result, yielding the biased estimates. In addition, there is no direct way to assess the "lack of fit" or experimental imprecision of the determination (Haas, 1988). M e t h o d 2 (Error-in-variables m i n i m i z a t i o n )

In this method, no distinction could be made between independent and dependent variables.

i (log S, C, z, fl ) = 0

(7)

where log S is the "true" value of variable log S, and fl a vector of parameter estimates. The solution of simultaneous equations defined by equations (6) and (7) follows the procedures of Valk6 and Vajda (1987). It is an iterative procedure, computed by estimating the values of parameters by incorporating the classical nonlinear regression method, and correcting the "true" values of log S by employing equilibrating techniques. The EVM is free of arbitrary assumptions underlying the forced linearization. It represents more reasonable and consistent method to produce unbiased estimates with minimum variance for implicit models. If the log S is normally distributed, the least squares regression gives the maximum likelihood estimates for the parameters fl and for the variable log S (Bard, 1974). The method also permits statistical assessment of the properties of estimates and the lack of fit to the model. M e t h o d 3 (Classical nonlinear regression).

Assuming that k2 is equal to zero, one can simplify equation (3) as log S = - l o g (1 + k~ C"z).

(8)

Essentially, the equation describes the level of inactivation for the first order decay process (ChickWatson model) in a CSTR system. As the least square variable can be explicitly expressed by the fixed variables, the classical nonlinear regression becomes applicable, with the objective function being the same as equation (6). As a special case of the EVM, the classical nonlinear regression possesses the same merits as the EVM does. With the a d hoc assumption that the inactivation must follow the Chick-Watson model, however, it provides little application in practice.

Ozone disinfection kinetics

681

Table 1. Protocols of Monte Carlo study Case I (C-W)

Case 2 (tailing)

Case 3 (shoulder)

3.0 2.0×104 1.00

3.0 2.0 x 104

3.0 2.0 x 104 0.75

Parameters

n k I (rag/l)-3×s i 1 + k2/k I Inactivation

C (mg/l) r (s)

O.OI 10

1.25

0.025 25

0.010 100

0.25 250

0.50 500

the variates were a d d e d to the log S * ' s to produce the simulated observations. They were generated from an assumption o f normal distribution using the acceptance/rejection technique with a c o m m o n standard deviation o f 0.3. The c o m p u t a t i o n was performed using algorithm R N N O A from I M S L / M A T H library. In total, there are 100 trials for each o f the scenarios investigated. Detailed simulation protocols are listed in Table 1. Fitting the kinetic model to the inactivation data was performed analytically. The least squares regression to obtain the o p t i m u m estimates of kinetic parameters was performed by employing the quasiN e w t o n algorithm from the software package S y s t a (~ . A scalar factor o f 10 was multiplied to the concentration o f ozone to resolve the difficulty in regression o f ill-conditioning. The equilibrating o f equation (7) was performed using the preset spreadsheet to renew the " t r u e " values o f log S. In these two alternating procedures, the tolerance level is 10 5 in either parameters or the " t r u e " log S's.

3. PROCEDURES

The study was carried out in two steps: M o n t e Carlo analysis and case studies. The M o n t e Carlo simulation was first used to c o m p a r e the efficiency o f three methods. Three c o m m o n types o f inactivation kinetics were studied: the first order decay inactivation (Case 1), tailing inactivation (Case 2), and shoulder inactivation (Case 3). As expressed by equation (2), they c o r r e s p o n d to the values o f k 2 / k ~ equal to, larger than, and smaller than zero, respectively. Then, two sets o f experimental data were fit to illustrate the E V M application and obtain appropriate kinetic expressions for the inactivation with ozone. The data o f bacterial inactivation by Z h o u and Smith (1994) and those o f virus inactivation by Roy e t al. (1981) were selected, one with bacteria at 21°C and pH 6.5 (S1), and a n o t h e r with virus at 20°C and p H 7.2 ($2). Both investigations were carried out in the C S T R system and detailed experimentation can be found in the cited references. To do the M o n t e Carlo simulations, the true log S at a given condition was generated from the exact solutions log S * for equation (3). F o r comparison purposes, the c o m m o n values o f coefficient o f dilution n and rate c o n s t a n t k~ were assumed. However, three k 2 / k I values o f 0.0, 0.25 and - 0 . 2 5 , were chosen to reflect the different types o f kinetics. Then,

4. RESULTS .AND DISCUSSION 4.1.

Monte

Carlo

analysis

The results o f parameter estimates from M o n t e Carlo computations are summarized in Table 2. The bias is defined as the difference between the mean

Table 2. Summary of parameter estimate results from Monte Carlo study n Method 1 (forced linearization) Mean 2.46 Bias (%) 18.0 SE 0.69 Variability (%) 28.1 SD 0.88 Wilcoxon test 60.8 Method 2 (EVM) Mean 2.95 Bias (%) 1.67 SE 0.51 Variability (%) 17.3 SD 0.51 Wilcoxon test 9.73 Method 3 (nonlinear regression) Mean 3.01 Bias (%) -0.33 SE 0.13 Variability, % 4.3 SD 0.13 Wilcoxon test 19.8

Case 1 (C-W) logk~ I +k2/k I

n

Case 2 (tailing) logk h 1+k2/k ~

n

Case 3 (shoulder) logk~ I +k2/k I

3.03 0.69 29.6 31.0 1.11 0.12 36.6 17.7 1.69 0.33 P < 0.001

2.22 25.9 0.80 36. I 1.11 72.8

2.47 0.72 42.6 42.2 1.28 0.15 51.9 21. I 2.23 0.15 P < 0.001

2.66 11.2 0.54 20.4 0.64 48.6

3,51 0.60 18,5 20.4 0.89 0.08 25.4 13.9 1,19 0.17 P < 0.001

4.18 0.98 2.81 2.00 1.02 0.17 24.4 17.4 1.02 0.17 P < 0.01

2.97 1.00 0.66 22.2 0.66 9.40

4.21 1.23 2.12 1.60 1.33 0.27 31.6 22.0 1.33 0.27 P < 0.01

2.95 1.67 0.37 12.5 0.38 9.24

4.18 2,81 0,74 17.7 0.75

0.74 1.33 0.09 12.2 0.09 NS

4.30 0.02 0.13 3.0 0.13 P < 0.001

Note: bias = (average - true)/true; SE = standard error; SD = standard deviation; variability= average/SE: P -=significant level: NS = statistically insignificant(P > 0.05).

682

HONGDEZHOUand DANIELW. SMITH

value over all 100 trials and the true value, and divided by the true value. The standard error (SE) is the square root of variance about the mean, while the standard deviation (SD) includes that due to the bias from the true value. The coefficient of variation (CV) is the standard error divided by the mean value. The first comparison between three methods is the bias in estimates. The forced linearization produces the substantially biased n, log kl and (1 + k2/kl ), with a bias of up to 42%. In particular, the k2/k~ values in all cases were biased to be less than zero. In other words, the use of the forced linearization method would prejudice the inactivation as a shoulder effect phenomenon. In contrast, the EVM produced a bias of less than 3%, which should be considered as acceptable in practice. It can also truly reflect the type of inactivation. In all three cases, the ratios of k2/k l are close to the assumed values. For the classical nonlinear regression method, a slightly better estimation could be achieved. However, it was shown that the difference in bias between the EVM and the

classical nonlinear regression method is not statistically significant. Therefore, a comparison between the classical nonlinear regression and the EVM for the Chick-Watson model type of inactivation will not be emphasized further. Figures 1 to 3 show typical plots of correlation between the estimates over all the scenarios. Both the forced linearization and the EVM produce the estimates with a large standard error and standard deviation. As a result, the coefficient of variance is as high as 50 and 30% for the forced linearization and the EVM, respectively. The problem is common for similar mathematical models which inherit the adverse mutual influence of the estimate of one parameter on that of another (Himmelblau, 1970). It results from the high correlation between the estimates. For the classical nonlinear regression, the estimates are much more efficient, with a coefficient of variance of less than 5%. Despite the large variability, the EVM provides an excellent estimate of parameter variances caused by

5.0

2.5

o •-

O0

0

-2.5 X

J

-5.0 -4

i

J

J

0 Error (n)

2

4

I

-2

0.8

0.8

,-,

0.4

0.4

+

0.0

X

×

+

x

X X, ~

x

"2

0

m

X

×

0.0

-0.4

X~ X

-0,4 x

X X

m

-0.8

-4

-2

X ,

0 Error (n)

m

t

2

4

-0.8

l

-5.0

-2.5

,

,

'

0.0 2.5 Error (log k 1)

Fig. 1. Scatter plots of Case 1 estimates using the forced linearization.

!

5.0

Ozone disinfection kinetics

683

5.0 k

2.5

0.0

-2.5

i

-5.0

-2

-4

i

!

0

2

Error (n)

081

08

x

¢q

+

0.0

;-

+

-0.4

0.0

-0.4

-0.8

'

-4

m

'

-2

0 Error (n)

2

,

I

4

-0.8

-5.0

'

I

m I

-2.5 Error

m i

0.0 2.5 flog k 1)

I

5.0

Fig. 2. Scatter plots of Case I estimates using the EVM.

the experimental error. As shown, the standard error is almost identical with the corresponding standard deviation, with a difference of less than 2%. In general, the forced linearization results in a larger difference (up to 200%) between standard error and standard deviation. The multivariate Witcoxon sign test further showed that the EVM produces the lowest deviated parameters from the assumed location. It followed the procedure of Hettmansperger (1983), with the null hypothesis that the median difference between the true and estimated parameters is zero, and that the distribution of difference is symmetric. In the case of shoulder curve, the bias is not distinguishable from the zero at a significant level of 0.05, and in the other two cases the bias falls within the vicinity of critical value. Even at a significant level of 0.001, the forced linearization causes a severe difference. Further statistical tests for validating the EVM are to examine the distribution of residuals from regression. First, residual plots were constructed

against each of variables to examine the randomness of errors and the consistency of residuals with a normal distribution. Figure 4 is a typical representation of this kind. There is a trend of residuals from the forced linearization along with log S, whereas the errors are mostly random for the EVM. Then, the ;(2-test was used to affirm the zero-mean hypothesis about the residuals with the known variance of 0.32 . As shown in Table 3, the EVM produces a mean sum of square of residuals about 10 times less than the forced linearization. Only in the case of the declining rate inactivation does the EVM not cause a rejection of hypothesis. In the other two cases the difference is not evident at a significant level of 0.01. For the forced linearization, the difference is substantial, even at a significant level of 0.001, for all the scenarios under considerations. 4.2. Case studies

Following the same procedures to the Monte Carlo simulations, the two sample data sets were analyzed

684

I

0.4

0.2

2.0

x

÷

X

+

1.5

X x

÷ +

Xx ~xxX~ x

O0

•~,

HONGDE Z H O U a n d DANIEL W . SMITH

x ~ X / ~.

+

+ +

1.0 +

xx

+

+4 +

"~ 0.5 "5

0.0

0 0

0

0

0.0

g.l

o

0.0

0.2

-6

0.4

49 0

0

0

~0~

+ o $

by the forced linearization and the EVM. Figures 5 and 6 present a comparison of the predicted to the experimental observations. The solid lines represent predictions calculated using the best parameter estimates from the EVM. An excellent conformity can be seen between the observed and model solution. Table 4 summarizes the estimated parameters for the two samples. Their variance and covariance matrix were evaluated according to Bard (1974) for the EVM. Since it is unreasonable to make the assumption that the inactivation must follow the first order decay process, the analysis with the classical nonlinear regression was not performed. The results indicate that the estimates of model parameters from these two methods are substantially different. Using the EVM estimates as the reference values, the forced linearization produces significant biases in estimates. The bias is so large that the value of k 2 / k I would signify the incorrect type of inactivation. For example, Method 1 indicates that the inactivation in the two cases studies appeared to be the type of the shoulder curves. In contrast, the results of method 2 clearly indicates that the inactivation of E . c o l i is closely described by the Chick-Watson model ( k 2 / k l = 0). For both methods, the variance/covariance matrix shows that the correlations between parameter estimates are high. As a result, the estimates are associated with

Case 2

37

Case 3

33

oo

÷+

I

1

I

I

I

1

I

-5

-4

-3

-2

-1

0

1

wide confidence intervals, suggesting that there is a large uncertainty in estimating the individual parameters. From the above analysis, it is interesting to note that the inactivation for both virus and bacteria with ozone is associated with a high coefficient of dilution (n). Though the exact explanations can not be currently elucidated, its significance with respect to the optimization of ozone contactors and the stipulation of ozonation criteria is obvious. The ozone inactivation would be more preferable at the high concentration of ozone residual with a short contact time. The current CT concept, mostly derived from the traditional disinfectants such as chlorine and its derivatives, may not be valid for the ozonation. The same data for poliovirus inactivation ($2) were analyzed by Roy e t al. (1981), using either the empirical relationship C"z99 = K

(9)

or Method 1 presented herein. The former approach gave the coefficient of dilution n equal to 3.4 at 99% inactivation, while the latter concluded the rate equation as dN dt

- --

= kCN

0'69.

Table 3. Statistical test on M o n t e Carlo residuals

34

+-:

Fig. 4. A typical plot of residuals from the regression vs log S.

Fig. 3. Scatter plot of Case 1 estimates using the classical nonlinear regression.

Case I

~ +

Log S

E r r o r (n)

SSR

0

-

-l.0

-0.2

+ OO

0

-0.5

-0.4

+

0

0

0

-0.2

-0.4

+

Method 1

O Method 2

Method 1 Z 2"test 379 P < 0.001 411 P < 0.001 365 P < 0.001

SSR 3.5 3.6 3.6

Method 2 X 2"test 39 P < 0.01 40 P < 0.01 40 P < 0.01

Method 3 SSR Z -'-test 2.1

23 NS

Note: SSR = sum o f squares o f rcsiduals; P = significant level; N S = statistically insignificant ( P > 0.05).

(10)

685

Ozone disinfection kinetics 0

-1

0 -

--

0

k a~

i

20s

,, 60 s

O

O 0.1 mg/I

A 0.12 mg/I

-1 " ~ k

-2 O ,-.1

~-~\

t:2 0.15mg/1

x

-3 .4-

-5 Xy

-6



I

L

I



I

J

0.1

0.2

0.3

0.4

0.5

0.6

C (mg/l)

-4 /

Fig. 5. Comparison of the EVM predictions and observations for Sample 1 at different contact times.

1

I

I

I

I

I

I

20

40

60

80

100

120

140

Fig. 6. Comparison of the EVM predictions and observations for Sample 2 at different ozone residuals. The different conclusions drawn from the same data signal the importance of parameter estimation methods for the ozone disinfection kinetic analysis. As usual, the use of the improper forced linearization method led to the reduction in " n " value and suggested the inactivation as the shoulder type. Besides, in their calculations, several critical points at the initial state of inactivation were not included, possibly to avoid the occurrence of nonlinearity of a plot of log(N o - N ) / z versus log N and inconsistency of slopes for each of the ozone residuals. As a result, they obtained the kinetic parameters different from those listed in Table 4.

5. SUMMARY

Because of the high reactivity of ozone and its strong biocidal potential, it is advantageous to obtain the reliable data of inactivation from the CSTR

system in order to formulate the kinetics of ozone disinfection. Adequate analysis of these data was achieved using the EVM approach. It produces parameter estimates with the least bias and minimum variance of residuals from the regression. Although traditionally used, the forced linearization method produces biased estimates of model parameters. The bias is so severe that the results could signify the incorrect type of inactivation. Requiring the a d hoc assumption of first order decay rate, the classical nonlinear regression is of little use in practice. The analysis of two sample data sets demonstrated that the EVM predicts the inactivation in excellent conformity with the observations. It was found that the coefficient of dilution for the inactivation of bacteria and virus with ozone is over 2.4, indicating the ozone residual is a dominant factor over the contact time to determine the efficiency of ozone

Table 4. Results of parameter estimation for sample data sets Method I (forced linearization) Log k] n I + k2/k I Inactivation o f Bacteria with ozone Estimate 2.84 2.49 Variance/covariance matrix Log k~ 0.34 n 0.17 0.09 1 + k2/k I 0.05 0.03 SE 0.58 0.30 CV (%) 20.4 12.0 Bias (%) -34.3 -24.3

0.74

0.001 0.09 12.2 -26.0

Log k I

Method 2 ( E V M ) n I + k2/k t

4.32

3.29

0.99

0.06 0.03 0.01 0.24 5.6

0.02 0.005 0.13 4.0

0.002 0.04 4.0

0.680

2.44

0.859

0.004 0.0005 0.060 2.5

0.0002 0.013 1.5

Inactivation o f Virus with ozone Estimate

1.52

2.02

Variance/covariance matrix Log k~ 0.44 n 0.36 0.33 I + k2/k r 0.06 0.04 SE 0.66 0.58 CV (%) 43 29 Bias (%) 123 -17

0.680

0.01 0.11 17 -21

0.005 0.004 0.0008 0.072 II

686

HONGDEZHOUand DANIELW. SMITH

disinfection. This confirms that the disinfection with ozone will be more efficient at the higher ozone residual within a shorter contact time.

REFERENCES

Bard Y. (1974) Nonlinear Parameter Estimation. Academic Press, New York. Haas C. N. (1988) Maximum likelihood analysis of disinfection kinetics. Wat. Res. 22, 669--677. Hettmansperger T. P. (1983) Multivariate location tests. In Encyclopedia of Statistical Sciences (Edited by Kotz S. and Johnson S. L.). Wiley, New York.

Himmelblau, D. M. (1970) Process Analysis by Statistical Methods. Wiley, New York. Roy D., Chian E. S. K. and Engelbrecht R. S. (1981) Kinetics of enteroviral inactivation by ozone. J. Envir. Engng ASCE 107, 887-901. Valk6 P. and Vadja S. (1987) An extended Marquardt-type procedure for fitting error-in-variables models. Comp. Chem. Engng 11, 37--43. Venczel L., Sobsey M. D. and Crawford-Brown D. (1992) The inactivation kinetics of monochloramine on monodispersed hepatitis A virus and MS2. Proc. Annual A WWA Conf." Water Quality American Water Works Association, Vancouver, B. C., Canada pp. 531-543. Zhou H. and Smith D. W. (1994) Kinetics of ozone disinfection in a CSTR system. J. Envir. Engng, ASCE. 120, 4, 841-858.