Disinfection by-product formation following chlorination of drinking water: Artificial neural network models and changes in speciation with treatment

Science of the Total Environment 408 (2010) 4202–4210


Journal homepage: www.elsevier.com/locate/scitotenv

Pranav Kulkarni a,1, Shankararaman Chellam a,b,⁎,2

a Department of Civil and Environmental Engineering, University of Houston, Houston, TX 77204-4003, United States
b Department of Chemical and Biomolecular Engineering, University of Houston, Houston, TX 77204-4004, United States

Article info

Article history: Received 17 February 2010; received in revised form 21 May 2010; accepted 29 May 2010; available online 26 June 2010

Keywords: Artificial neural networks; Disinfection by-products; Drinking water; Trihalomethanes; Haloacetic acids; Total organic halide; Chlorine disinfection; Nanofiltration; Granular activated carbon

Abstract

Artificial neural network (ANN) models were developed to predict disinfection by-product (DBP) formation during municipal drinking water treatment using the Information Collection Rule Treatment Studies database compiled by the United States Environmental Protection Agency. The formation of trihalomethanes (THMs), haloacetic acids (HAAs), and total organic halide (TOX) upon chlorination of untreated water, and after conventional treatment, granular activated carbon treatment, and nanofiltration, was quantified using ANNs. Highly accurate predictions of DBP concentrations were possible using physically meaningful water quality parameters as ANN inputs, including dissolved organic carbon (DOC) concentration, ultraviolet absorbance at 254 nm and one cm path length (UV254), bromide ion concentration (Br−), chlorine dose, chlorination pH, contact time, and reaction temperature. This highlights the ability of ANNs to closely capture the highly complex and non-linear relationships underlying DBP formation. Accurate simulations suggest the potential use of ANNs for process control and optimization, for comparison of treatment alternatives for DBP control prior to piloting, and even for reducing the number of experiments needed to evaluate water quality variations when operating conditions are changed. Changes in THM and HAA speciation and bromine substitution patterns following treatment are also discussed. © 2010 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Department of Civil and Environmental Engineering, University of Houston, Houston, TX 77204-4003, United States. Tel.: +1 713 743 4265; fax: +1 713 743 4260. E-mail address: [email protected] (S. Chellam).
1 Present address: Trinity Consultants, Houston, TX, United States.
2 Originally submitted to Science of the Total Environment on February 17, 2010. Revised version submitted on May 23, 2010.
0048-9697/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.scitotenv.2010.05.040

1. Introduction

In addition to inactivating microorganisms, chemical disinfectants such as chlorine also react with natural organic matter (NOM) and bromide ion in water to form numerous disinfection by-products (DBPs), which have been implicated as human mutagens, carcinogens, and teratogens (Hamidin et al., 2008). Trihalomethanes (THMs) and haloacetic acids (HAAs) constitute the major halogenated DBPs currently regulated in drinking water, accounting for approximately half the total organic halide (TOX) concentration. Since THMs and HAAs are not typically present in the source water but are by-products formed during chlorination as an unintended consequence, they are most often controlled by reducing the concentrations of their precursors (particularly NOM) prior to adding chlorine.

In this manuscript, we consider three important water treatment processes employed for NOM (and DBP precursor) removal from drinking water sources: (i) conventional treatment (coagulation–flocculation–sedimentation–media filtration), (ii) granular activated carbon (GAC) adsorption, and (iii) nanofiltration (NF) (Chen et al., 2007). We are particularly interested in the formation of THMs, HAAs, and TOX following these processes using free chlorine as the disinfectant. It should be emphasized that even though current regulations do not include TOX, good treatment practices necessitate its control as well.

It is difficult to derive purely mechanistic models of DBP formation in natural waters due to the inherent heterogeneity of NOM, the complex background chemistry of municipal water supplies, and large variations in water quality of surface water supplies with season and location in terms of NOM concentrations, origin, and characteristics. Additionally, since the removal of specific NOM components depends on the treatment processes employed (e.g. coagulation and GAC preferentially remove hydrophobic portions and NF preferentially removes higher molecular weight portions), the DBP yield is changed upon treatment, further complicating the chemistry and prediction of DBP formation. Hence, DBP mass concentrations [DBP] are typically modeled empirically by linearly regressing each of the water quality parameters influencing DBP formation, including dissolved organic carbon concentration (DOC), ultraviolet absorbance at 254 nm and one cm path length (UV254), bromide ion concentration (Br−), chlorine dose (Cl2), chlorination pH (pH), contact time (Time), and reaction temperature (Temp), specific to each water supply and treatment process:

$$[\mathrm{DBP}] = k \times \mathrm{DOC}^{a} \times \mathrm{UV254}^{b} \times \left(\mathrm{Br}^{-}\right)^{c} \times \mathrm{Cl_2}^{d} \times \mathrm{pH}^{e} \times \mathrm{Time}^{f} \times \mathrm{Temp}^{g} \tag{1}$$

where k, a, b, c, d, e, f, and g are empirical constants. Log-linear power functions similar to Eq. (1) are extensively employed to model THM and HAA formation (e.g. Chowdhury et al., 2009; Hong et al., 2007; Sadiq and Rodriguez, 2004a; Sohn et al., 2004; Uyak et al., 2007; Westerhoff et al., 2000) even though they are best suited only to predict the central tendencies of the databases used to develop them. In contrast, artificial neural networks (ANNs) have the capability, in principle, to approximate any continuous function (and its derivatives) to any desired degree of accuracy. Also, the superior ability of ANNs to handle noisy, distorted multivariate data makes them a more powerful modeling tool than regression models such as Eq. (1).

Paradoxically, even though ANNs have the potential to predict DBP formation better than multivariate regression models, only a very limited number of studies have considered ANNs for DBP formation upon chlorination. Rodriguez and co-workers (Milot et al., 2002; Rodriguez et al., 2003) focused exclusively on THMs (HAAs and TOX were excluded in these studies) and also evaluated the health risks of DBPs using fuzzy logic models (Sadiq and Rodriguez, 2004b). Additional research is necessary to specifically evaluate the capability of ANNs to predict DBP formation following the many water treatment processes implemented for DBP control.

The principal objective of this research is to derive accurate ANN models for THM, HAA, and TOX formation following chlorination of raw and treated (conventional treatment, GAC, and NF) waters. We also provide additional experimental data on changes in THM and HAA speciation, focusing on variations in bromine substitution with treatment. ANNs were implemented and validated using a large dataset that includes an extensive set of bench-scale, pilot-scale, and full-scale experiments from numerous water treatment plants located in the United States (Allgeier et al., 1998).
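For reference, the log-linear character of the regression baseline in Eq. (1) can be illustrated with a short sketch. Taking logarithms turns the power function into a linear model, so its constants can be estimated by ordinary least squares. The function names and all numerical values below are hypothetical and chosen only for illustration; a single predictor (DOC) is fitted to keep the example small, whereas Eq. (1) uses seven.

```python
import math

def dbp_power_law(k, exponents, inputs):
    """Evaluate a power function of the form of Eq. (1): k * prod(x ** exponent)."""
    result = k
    for name, exponent in exponents.items():
        result *= inputs[name] ** exponent
    return result

def fit_log_linear(x_values, y_values):
    """Least-squares fit of log(y) = log(k) + a*log(x); returns (k, a)."""
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    a = sxy / sxx                       # slope in log-log space is the exponent
    k = math.exp(mean_y - a * mean_x)   # intercept in log-log space is log(k)
    return k, a

# Hypothetical, noise-free data generated with k = 10 and a = 1.2.
doc = [1.0, 2.0, 4.0, 8.0]
thm = [dbp_power_law(10.0, {"DOC": 1.2}, {"DOC": d}) for d in doc]
k_hat, a_hat = fit_log_linear(doc, thm)
```

With real, noisy measurements the fitted constants describe the central tendency of the data, which is the limitation of Eq. (1) noted above.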
Network connection weights are also interpreted quantitatively to develop more insights into the relative importance of individual physicochemical factors known to influence DBP formation and speciation.

2. Neural networks

A commercially available software program (Matlab neural network toolbox 6.1, The Math Works, Inc., Natick, MA) was used to implement ANNs on a personal computer. In this study, feed forward, back propagation ANNs consisting of an input layer, one or two hidden layers, and an output layer were developed. To minimize network complexity and improve its performance, the least number of physically meaningful input parameters was employed, viz. DOC concentration, UV254, Br− concentration, and chlorination conditions including Cl2 dose, contact time, pH, and temperature, which are the same parameters as in Eq. (1). The output consisted of a single neuron representing TOX, THM, or HAA concentrations. As commonly practiced, the number of hidden layers and the number of hidden neurons were determined iteratively using trial and error. In this study, one or two hidden layers consisting of 4–8 neurons were found to be satisfactory for all simulations. To improve the efficiency of batch training, the Levenberg–Marquardt algorithm with optimum learning rate between 0.01 and 0.0001 was chosen through experimentation to avoid instability and excessive convergence time. All synaptic weights were initialized randomly in the range (−0.5, +0.5) and accordingly readjusted (via back propagation) to reduce the difference between actual and desired output in terms of the sum of squared error (SSE):

$$\mathrm{SSE} = \sum_{i=1}^{n_{\mathrm{train}}} \left([\mathrm{DBP}]_{\mathrm{obs}} - [\mathrm{DBP}]_{\mathrm{pred}}\right)^{2} \tag{2}$$

where [DBP]obs is the experimental or observed DBP concentration, [DBP]pred is the corresponding ANN output or prediction, and ntrain is the number of observations employed for ANN training. For each simulation, the network was trained iteratively until SSE < 10⁻⁵ or the maximum gradient was reached. As is usually necessary, data were normalized by the corresponding maximum value to put them in the range 0–1.

Training data were carefully chosen so that they had a greater descriptive ability while simultaneously making an effort to use a minimum number of observations. Because ANNs are better suited to interpolate rather than extrapolate, the maximum and minimum values of each input parameter were always chosen to train the network, leaving intermediate measurements for validation. The optimum network architecture for each type of treatment and untreated water was obtained by multiple runs. Network validation was performed by providing only those input values that were not included in the original training set. The quality of DBP predictions in comparison to the desired outputs in the validation dataset was evaluated both in terms of the regression coefficient and its N25 value, which represents the percentage of predictions that have less than 25% absolute relative error calculated as:

$$\text{Absolute Relative Error} = \frac{\left|[\mathrm{DBP}]_{\mathrm{pred}} - [\mathrm{DBP}]_{\mathrm{obs}}\right|}{[\mathrm{DBP}]_{\mathrm{obs}}} \tag{3}$$
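The data preparation and validation metric described above can be sketched as follows. This is a minimal illustration, not the Matlab implementation used in this study: the function names and numerical values are hypothetical, and the selection routine only shows the idea of assigning the rows that hold each input's minimum and maximum to the training set.

```python
def normalize_by_max(column):
    """Scale non-negative values into the range 0-1 by dividing by the maximum."""
    peak = max(column)
    return [value / peak for value in column]

def extreme_row_indices(rows):
    """Indices of rows holding the minimum or maximum of any input column.

    Sending these rows to training means validation rows lie inside the
    trained range, so the network interpolates rather than extrapolates.
    """
    chosen = set()
    for col in range(len(rows[0])):
        values = [row[col] for row in rows]
        chosen.add(values.index(min(values)))
        chosen.add(values.index(max(values)))
    return sorted(chosen)

def absolute_relative_error(predicted, observed):
    """Eq. (3) for a single prediction."""
    return abs(predicted - observed) / observed

def n_within(predicted, observed, threshold=0.25):
    """Percentage of predictions with error below threshold (N25 by default)."""
    errors = [absolute_relative_error(p, o) for p, o in zip(predicted, observed)]
    hits = sum(1 for e in errors if e < threshold)
    return 100.0 * hits / len(errors)

# Hypothetical inputs: (DOC in mg/L, pH) for four samples.
rows = [(2.0, 6.0), (5.0, 7.5), (9.0, 9.0), (4.0, 7.0)]
training_rows = extreme_row_indices(rows)   # rows 0 and 2 hold all extremes

# Hypothetical validation results in ug/L.
observed = [40.0, 80.0, 100.0, 20.0]
predicted = [44.0, 50.0, 110.0, 21.0]
n25 = n_within(predicted, observed)         # three of four within 25%
```

Passing threshold=0.10 to the same function gives the N10 value used later for the error distributions.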

The contributions of each input variable to predict DBP concentrations were determined by the Garson weight partitioning method using the absolute values of the neuronal connection weights (Garson, 1991; Goh, 1994):

$$\text{Relative importance of input variable } v = \frac{\displaystyle\sum_{j=1}^{n_H}\left[\frac{i_{vj}}{\sum_{k=1}^{n_v} i_{kj}}\, O_j\right]}{\displaystyle\sum_{u=1}^{n_v}\left[\sum_{j=1}^{n_H}\left(\frac{i_{uj}}{\sum_{k=1}^{n_v} i_{kj}}\, O_j\right)\right]} \tag{4}$$

where n_v and n_H are the number of input and hidden neurons, respectively, and i_vj and O_j denote the absolute values of the connection weights from input v to hidden neuron j and from hidden neuron j to the output, respectively. Eq. (4) is best suited to interpret the trends in relative importance of input variables rather than the calculated absolute values.

2.1. Experimental dataset

All data used in this manuscript are available in the ICR Treatment Study Database developed by the United States Environmental Protection Agency using bench-, pilot-, and full-scale DBP precursor removal data provided by public water systems meeting certain criteria (Allgeier et al., 1998). GAC and NF studies were performed encompassing seasonal variations in ground- and surface waters representing numerous full-scale water treatment plants across the United States (Allgeier and Summers, 1995; Bond and DiGiano, 2004). It should be emphasized that under the ICR Treatment Study requirements, DBP precursor removal was evaluated using simulated distribution system (SDS) testing (Koch et al., 1991). Limited available data suggest a greater complexity in predicting DBPs from actual real-world distribution systems compared with studies employing SDS testing (e.g. Platikanov et al., 2007; Shimazu et al., 2005). This is because distribution systems have several nonidealities (e.g. dead-zones, potential presence of biofilms, different chlorine decay kinetics caused by pipe wall roughness, a distribution of residence times, non-uniform disinfectant concentrations, etc.) that cannot be accurately simulated in a simple SDS test run in batch mode.
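Returning briefly to the weight partitioning of Eq. (4), the calculation can be sketched for a network with one hidden layer and a single output neuron. This is an illustrative reading of the Garson method rather than the code used in this study; the function name and the weights are hypothetical, and networks with two hidden layers would need an extended treatment that is not shown here.

```python
def garson_importance(input_hidden, hidden_output):
    """Relative importance (%) of each input from connection weights.

    input_hidden[v][j] is the weight from input v to hidden neuron j;
    hidden_output[j] is the weight from hidden neuron j to the output.
    Absolute values are used throughout, as in Eq. (4).
    """
    n_v = len(input_hidden)
    n_h = len(hidden_output)
    abs_ih = [[abs(w) for w in row] for row in input_hidden]
    abs_ho = [abs(w) for w in hidden_output]

    # Sum of absolute input weights arriving at each hidden neuron.
    column_sums = [sum(abs_ih[v][j] for v in range(n_v)) for j in range(n_h)]

    # Share of each hidden neuron's output weight attributed to input v.
    contributions = [
        sum(abs_ih[v][j] / column_sums[j] * abs_ho[j] for j in range(n_h))
        for v in range(n_v)
    ]
    total = sum(contributions)
    return [100.0 * c / total for c in contributions]

# Hypothetical weights: two inputs, two hidden neurons, one output.
weights_ih = [[0.3, -0.1], [-0.1, 0.1]]
weights_ho = [0.5, -0.5]
importance = garson_importance(weights_ih, weights_ho)
```

Because only absolute values enter the calculation, the result indicates how strongly an input is weighted, not the direction of its effect on DBP formation.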


In any case, all DBP data used in this research were generated through SDS testing under the ICR Treatment Study requirement (rather than actual measurements from various locations within an actual distribution system). Qualifying municipalities evaluated GAC and NF to provide technical and economic data to assess the feasibility of these technologies for precursor control in anticipation of more stringent DBP regulations. SDS testing allowed municipalities to evaluate various design parameters for these advanced technologies under site-specific conditions without sending the treated water to their customers through the distribution system. In other words, the purpose of ICR Treatment Studies was not to measure DBP concentrations in full-scale distribution systems following existing treatment techniques. Rather, its purpose was to determine the extent of DBP control achieved by GAC and NF for a geographically diverse set of water supplies so that they can potentially be installed for large-scale water treatment in the future.

A separate 18-month DBP monitoring effort was also required under the ICR, wherein water samples were collected at numerous locations within full-scale water treatment plants including the distribution system (McGuire et al., 2002; Obolensky and Singer, 2005). Multiple linear regression models incorporating important water chemistry parameters have been recently developed for DBP formation using these data (Obolensky and Singer, 2008). A potential research opportunity is to use this monitoring (as opposed to the Treatment Study) database to compare the relative accuracy of regression and ANNs to predict DBP concentrations in real-world distribution systems. Detailed information on pretreatment, unit processes, water quality, etc. is available in the ICR database (EPA, 2000).
In addition to the extensive data obtained, the ICR imposed stringent quality control requirements for the conduct of treatment studies as well as analytical methods for water quality and DBP measurements, making it a very high quality database ideally suited to develop DBP formation models. A summary of the database with the range of each of the seven ANN input parameters (DOC, UV254, Br−, and chlorination conditions including Cl2 dose, contact time, pH, and temperature) is given in Table 1. The Environmental Protection Agency established Minimum Reporting Levels (MRLs) separate from method detection limits for each water quality parameter, corresponding to laboratories' ability to measure analytes with a predetermined accuracy and precision. The MRLs specified under the ICR have been reported elsewhere (Chellam and Taylor, 2001; EPA, 1996). If any of the seven ANN input values was below the minimum reporting level (BMRL), that entire set of readings was excluded, whereas output values (DBP concentrations) with BMRL values were assigned a value of half the MRL. Datasets wherein ammonia was detected were excluded in this research because only DBP formation with free chlorine is considered in this manuscript.

Based on the water treatment unit processes used and available data, the entire database was divided into four subsets, viz. untreated water (5 datasets), conventional treatment (30 datasets), water treated by GAC (57 datasets), and NF (26 datasets). Each bench-scale study includes four seasonal THM, HAA, and TOX measurements at different operating conditions along with related operational and water quality parameters such as sampling time, operation time, pH, turbidity, DOC, UV254, Br−, and simulated distribution system conditions using free chlorine. Bench-scale experiments were performed either with a flat sheet of a commercially available nanofiltration membrane (Allgeier and Summers, 1995) or using the rapid small scale column test protocol to evaluate GAC treatment (Bond and DiGiano, 2004). Simulated distribution system tests (Koch et al., 1991) were also performed in existing full-scale municipal water treatment plants that employed NF or GAC. (Note that even full-scale treatment facilities were not required to sample from their existing distribution system but to report SDS testing results.) In summary, a large number of measurements spanning a wide range of influent water quality parameters from surface- and ground-waters, and from unit processes with varying design variables, pretreatments, and operating conditions have been used to model the complex relationships that exist between DBP concentrations and various factors responsible for their formation.

3. Results and discussion

3.1. Need to derive separate ANNs for each treatment technique

Principal Component Analysis (PCA) was employed to determine if a single neural network was capable of predicting DBP concentrations formed in the four waters (raw, conventional treatment, GAC, and NF) or whether four separate ANNs were necessary, one for each water type (Massart et al., 1997). Specifically, PCA was carried out to assess the associations between the contributions of raw water and the three DBP precursor removal techniques (conventional treatment, GAC, and NF) in THM, HAA, and TOX formation. PCA also reduced the number of independent variables by generating a new coordinate system of uncorrelated variables called principal components. One unique feature of PCA is that variables with similar properties group together, separating out the variables with dissimilar properties. A data subset in which all input variables except DOC varied only in a narrow range for untreated water and the other three unit processes was initially identified.
All DBP concentrations in this subset were first normalized by the corresponding DOC concentrations to calculate the DBP yield. Next, PCA was performed using singular value decomposition. Fig. 1 depicts the PCA plot in which each symbol represents these normalized concentrations of TOX, THM4 (sum of the four THM species, viz. CHCl3, CHClBr2, CHCl2Br, and CHBr3), and HAA6 (sum of six HAA species, viz. CH2ClCOOH, CHCl2COOH, CCl3COOH, CH2BrCOOH, CHBr2COOH, and CHClBrCOOH) from the subset. (Note that during the development of the ICR database, stable analytical standards for three other HAAs containing chlorine and bromine, viz. CCl2BrCOOH, CClBr2COOH, and CBr3COOH, were not widely available commercially.) Fig. 1 shows the distinct separation of each of the four symbols, even appearing in different quadrants, implying large variations in the extent of DBP formation in raw water and after conventional treatment, NF, and GAC under similar chlorination conditions. In mechanistic terms, PCA demonstrates that DBP yields and formation mechanisms depend on the type of precursor removal method employed. Hence, individual ANN models for each treatment process and for raw water were necessary to capture the differing underlying aqueous chemistries and NOM characteristics that resulted in varying DBP yields upon chlorination.

These results are consistent with the practice of deriving separate regression models for different unit processes (Chowdhury et al., 2009; Legube et al., 2004; Sadiq and Rodriguez, 2004a; Sohn et al., 2004) since the nature of DBP precursors (and consequently the yield and kinetics) is different in raw water and after various treatments. For example, coagulation is known to preferentially remove the fraction of NOM that is more hydrophobic, of higher molecular weight, and that has more binding sites (Singer, 1999). NF removes the higher molecular weight fractions and changes the specific ultraviolet absorbance (SUVA) and the Br−/DOC ratio between the feed and permeate waters (Chellam and Krasner, 2001). Similarly, precursor removal by GAC is a function of pore size distribution, NOM molecular weight distribution, and heterogeneity (Singer, 1999). In other words, separate ANNs were necessary since NOM reactivity towards chlorine and DBP yield change with the treatment process employed.

Therefore, separate neural networks were developed to predict DBP formation in raw water, as well as after conventional treatment, GAC adsorption, and NF. The network configuration and parameters such as learning rate, number of hidden layers, number of neurons in each layer, initial weights, etc. were varied for each simulation prior to predictions to obtain the most reliable ANN model in each case. The optimal neural network architecture that gave the best N25 values for each water type was determined using this procedure.

Table 1 Summary of ANN input parameters for the treatment studies employed in this manuscript.

| Treatment | TOC (mg/L) | UV254 (cm−1) | Br− (μg/L) | Cl2 dose (mg/L) | Temp. (°C) | pH | Contact time (h) |
|---|---|---|---|---|---|---|---|
| Untreated water | 1.72–14.40 | 0.047–0.673 | 37–510 | 3.70–20.0 | 20.0–28.8 | 7.7–9.1 | 6.3–83 |
| Conventional treatment | 0.90–5.86 | 0.021–0.161 | 20–1850 | 0.76–14.5 | 4.0–30.2 | 5.9–10.1 | 1.9–120 |
| GAC treatment | 0.05–8.25 | 0.001–0.225 | 20–1810 | 0.67–9.5 | 1.0–33.0 | 5.9–10.4 | 1.8–120.0 |
| Nanofiltration | 0.08–4.80 | 0.001–0.124 | 6–670 | 1.5–19.1 | 5.4–28.0 | 5.5–9.5 | 6.0–72.0 |

Fig. 1. Principal Component Analysis to determine the need for separate ANNs for individual water types. Raw, Conv., GAC, and NF denote raw water, conventionally treated water, GAC effluent, and NF permeate respectively.

3.2. ANN predictions of DBP concentrations

Figs. 2, 3, 4, and 5 depict comparisons of ANN predictions of TOX, THM4, and HAA6 concentrations with experimental observations in untreated water, and waters purified by conventional treatment, GAC adsorption, and NF respectively. Measurements from bench-, pilot-, and full-scale treatment processes are all shown. In each case, the number of data points used for training (Ntrain), validation (Ntest),

percentage of predictions within 25% absolute relative error (N25), and the regression coefficient (R2) are also reported. Note that to be more stringent, only experimental measurements employed for validating the neural network model are shown. Training datasets are not depicted since they were extremely well predicted by ANN models and were superposed directly on the line of perfect agreement. A summary of the number of points used for ANN training and validation along with the N25 values and regression coefficients for individual DBPs is given in Table 2. As observed in Table 2 and Figs. 2, 3, 4, and 5, neural networks gave consistently high N25 values (77–98%) and high regression coefficients (0.78–0.98) even when using only 7–22% of the data for training for each water type. Good THM, HAA, and TOX predictions using ANNs agree with earlier reports for THMs (Rodriguez et al., 2003) and bromate (Legube et al., 2004) and unequivocally demonstrate that they are capable of accurately incorporating the complex relationships that exist between precursor characteristics and chlorination conditions in forming DBPs even when using only a small fraction of available data for training. For raw water and conventionally treated water, training with ∼20–25% of the available data was sufficient to obtain N25 > 80% (see Figs. 2 and 3). For the GAC treated water (Fig. 4), training with only 8% of the data was adequate to obtain N25 > 80%, potentially since this dataset contained numerous measurements (∼3500) representing ∼60 drinking water treatment units, which allowed the network to exclude repetitive measurements from training data. Satisfactory predictions of DBP formation in NF permeate waters shown in Fig. 5 required two hidden layers, each containing 4–8 neurons, unlike the other three networks (for untreated water, conventionally treated water, and GAC treated water), where only one hidden layer was sufficient.

Fig. 2. Comparisons of ANN predictions with experimental measurements for DBP formation in raw waters.

Fig. 3. Comparisons of ANN predictions with experimental measurements for DBP formation in conventionally treated waters.

The scatter observed in Figs. 2–5 can be partially attributed to geographical diversity of source waters, different treatment schemes, pH, and chemicals employed at individual locations, seasonal changes in water quality (especially NOM characteristics), varying design and operational conditions at each location, differences in GAC and membrane type, small variations in flow rates which are difficult to maintain and monitor precisely during large-scale on-site experiments, and so on (Bond and DiGiano, 2004). It should be emphasized that changes in coagulant dosage and pH, type of coagulant, filtration conditions and filter design in conventional treatment, empty bed contact time (EBCT) and pretreatment in GAC adsorption, and type of membrane, flux, and recoveries in NF with season and location were all modeled using only one ANN for each unit process. Importantly, in spite of this variability, ANNs were able to satisfactorily predict DBP concentrations in all cases with meaningful input parameters, demonstrating their robustness and ability to accurately model THM, HAA, and TOX formation in a variety of water treatment scenarios.

Further, process variables (e.g. nanofilter permeate flux and feed water recovery, GAC EBCT, particle size and surface area, etc.) were not explicitly used as inputs. Rather, effluent water quality parameters obtained in a range of process operating conditions were input to ANNs. Accurate DBP predictions even in the absence of operating parameters imply that ANNs inherently captured the role of the changing nature, characteristics, and concentrations of precursors with treatment (e.g. changing molecular weight distribution and functionality, specific ultraviolet absorbance at 254 nm, hydrophobicity, etc.).

3.3. Error distribution

ANNs' predictive ability was evaluated in terms of the overall distribution of absolute relative error (Bowen et al., 1998) for TOX, THM4, and HAA6 in each of the waters. As observed in Fig. 6, using less than 25% of experimental measurements for training was still sufficient to predict the majority of observations within 10% absolute relative error (N10 > 60%) under a wide range of operational and water quality conditions. Our results demonstrate that the size of training datasets could be substantially reduced compared with the more than 50% used in a previous study employing ANNs for predicting THM concentrations (Rodriguez et al., 2003). Similar results were also obtained for individual THM and HAA species in our study (see Table 2), which demonstrates that ANNs can satisfactorily estimate DBP concentrations during municipal water treatment even when using only a small fraction of available data for training. Reducing the training burden on ANNs is a practically important issue since simulated distribution system (SDS) testing and DBP analysis are time consuming, require well trained laboratory personnel, and are consequently expensive to conduct.

Fig. 4. Comparisons of ANN predictions with experimental measurements for DBP formation in GAC effluent.

Fig. 5. Comparisons of ANN predictions with experimental measurements for DBP formation in nanofiltered waters.

3.4. Relative importance of input variables

Neural networks' ability to partition the influence of input variables on the output was exploited in a manner similar to interpreting independent variables' contributions to a dependent variable in regression equations (Garson, 1991; Goh, 1994). These contributions, expressed as percent relative importance (calculated using Eq. (4)), were used to interpret input–output relations in terms of the chemistry of DBP formation. Table 3 summarizes the relative importance (in percentage terms) of each of the 7 inputs employed to predict TOX, THM4, and HAA6 concentrations in this study. Even though these are purely empirical predictions, some of the weights are consistent with mechanistic interpretations. For example, DOC was by far the most important factor for DBP formation in raw water, accounting for ∼40% weight for THM4, HAA6, and TOX. This is consistent with the most popular current approach for DBP control, which is to reduce NOM concentrations prior to chlorination. However, DOC was not always the most important factor in treated waters, especially for NF and GAC, suggesting different DBP formation mechanisms in untreated and treated waters. This result confirms the PCA results summarized in Fig. 1 and the need to derive separate ANNs for each water type.

For each DBP, the relative importance of Br− ion concentrations was higher for GAC effluent and NF permeate compared with raw water and conventional treatment, which is attributed to the large increase in Br−/DOC ratio by GAC and NF technologies leading to the preferential formation of the highly brominated DBPs (Chang et al., 2001; Chellam and Krasner, 2001; Singer, 1999; Symons et al., 1993). These results are discussed in more detail in the next section. Chlorine dose was the most important simulated distribution system (SDS) parameter compared with contact time, temperature, and pH. This is also consistent with the current practice of reducing disinfectant dosage to reduce DBP concentrations.

3.5. Changes in DBP speciation with treatment

As discussed in the previous sections and Table 2, ANNs were able to statistically predict not only total THMs and HAA6 but also changes

in concentrations of individual THM and HAA species for raw and treated waters. Additionally, neuronal connection weights were meaningful from the standpoint of the chemistry underlying DBP formation. Hence, ANNs appear to be able to capture at least certain mechanistic aspects of DBP formation and not just make purely empirical calculations. This suggests that they are more robust than purely statistical regression models. In this section, the role of water treatment processes in inducing changes in individual DBP species is considered in more detail. Conventional treatment, GAC, and NF remove NOM to a much greater extent compared with the bromide ion increasing the Br−/DOC ratio. Therefore, these treatment processes not only reduce total DBP formation but also change THM and HAA speciation (Chellam and Krasner, 2001; Gould et al., 1983; Symons et al., 1996). This is particularly important since studies indicate increasing carcinogenicity and mutagenicity with bromine substitution e.g. (Myllykangas et al., 2003). Effects of treatment using a bituminous coal based GAC (F-400, Calgon Corp.) with 15-minute empty bed contact time on DBP control and changing THM speciation observed in this study are summarized in Fig. 7. The breakthrough curves for TOC, SDSTOX, and SDSTTHMs are seen in Fig. 7a. As expected, NOM removal by GAC decreased DBP precursor concentrations consequently decreasing TOX and total THM formation in the efﬂuent over time under SDS conditions. NOM removal also increased Br−/DOC which can be expected to inﬂuence DBP speciation. Fig. 7b depicts mole fractions of individual THM species as functions of Br−/DOC molar ratio. Under the experimental conditions investigated, CHCl3 monotonically decreased and CHBr3 monotonically increased with increasing Br−/DOC whereas the mixed bromochloro species

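Table 2. Summary of ANN simulations of DBP concentrations in four water types.

Raw water and conventional treatment:

| DBP | Raw N25 (%) | Raw R² | Raw Ntrain | Raw Ntest | Conv. N25 (%) | Conv. R² | Conv. Ntrain | Conv. Ntest |
|---|---|---|---|---|---|---|---|---|
| TOX | 84 | 0.96 | 22 | 79 | 82 | 0.88 | 90 | 480 |
| CHCl3 | 83 | 0.90 | 21 | 91 | 88 | 0.86 | 98 | 466 |
| CHBrCl2 | 89 | 0.94 | 20 | 99 | BMRL | | | |
| CHBr2Cl | 93 | 0.90 | 24 | 101 | 90 | 0.82 | 110 | 455 |
| CHBr3 | BMRL | | | | 97 | 0.98 | 47 | 217 |
| THM4 | 83 | 0.97 | 22 | 79 | 85 | 0.90 | 102 | 483 |
| CH2ClCOOH | 99 | 0.98 | 15 | 47 | 88 | 0.98 | 46 | 212 |
| CHCl2COOH | 85 | 0.94 | 14 | 88 | 85 | 0.90 | 52 | 284 |
| CCl3COOH | 88 | 0.94 | 18 | 73 | 84 | 0.92 | 84 | 433 |
| CH2BrCOOH | BMRL | | | | 90 | 0.98 | 42 | 179 |
| CHBr2COOH | 88 | 0.90 | 16 | 83 | 83 | 0.98 | 92 | 365 |
| CHClBrCOOH | 87 | 0.78 | 19 | 99 | 86 | 0.90 | 108 | 514 |
| HAA6 | 79 | 0.92 | 18 | 76 | 82 | 0.71 | 92 | 479 |

GAC effluent and NF permeate:

| DBP | GAC N25 (%) | GAC R² | GAC Ntrain | GAC Ntest | NF N25 (%) | NF R² | NF Ntrain | NF Ntest |
|---|---|---|---|---|---|---|---|---|
| TOX | 85 | 0.88 | 220 | 2885 | 83 | 0.92 | 53 | 292 |
| CHCl3 | 82 | 0.96 | 577 | 2726 | 83 | 0.98 | 48 | 247 |
| CHBrCl2 | 82 | 0.94 | 511 | 2996 | 85 | 0.98 | 63 | 264 |
| CHBr2Cl | 85 | 0.94 | 620 | 2814 | 87 | 0.96 | 70 | 304 |
| CHBr3 | 80 | 0.96 | 452 | 2662 | 87 | 0.75 | 80 | 272 |
| THM4 | 83 | 0.92 | 250 | 3300 | 80 | 0.92 | 42 | 319 |
| CH2ClCOOH | 64 | 0.94 | 140 | 436 | BMRL | | | |
| CHCl2COOH | 81 | 0.92 | 438 | 2739 | 87 | 0.98 | 71 | 250 |
| CCl3COOH | 79 | 0.94 | 289 | 2088 | 85 | 0.98 | 48 | 182 |
| CH2BrCOOH | 78 | 0.96 | 183 | 873 | BMRL | | | |
| CHBr2COOH | 84 | 0.96 | 538 | 2892 | 85 | 0.92 | 54 | 287 |
| CHClBrCOOH | 79 | 0.91 | 726 | 2975 | 82 | 0.96 | 45 | 252 |
| HAA6 | 82 | 0.92 | 237 | 3200 | 81 | 0.90 | 41 | 274 |

BMRL means that the majority of measurements were below the minimum reporting level.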
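The two goodness-of-fit columns in Table 2 can be illustrated with a short calculation. The sketch below rests on two assumptions that this section does not state: that N25 is the percentage of test predictions whose absolute relative error is at most 25%, and that R² is the squared Pearson correlation between measured and predicted concentrations. The function names and the four concentration pairs are hypothetical and are not taken from the ICR database.

```python
def n25(measured, predicted, tol=0.25):
    """Percentage of predictions with |relative error| <= tol (assumed meaning of N25)."""
    within = [abs(p - m) / m <= tol for m, p in zip(measured, predicted)]
    return 100.0 * sum(within) / len(within)


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values (assumed R2)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_m * var_p)


# Hypothetical measured and ANN-predicted concentrations (ug/L), for illustration only
measured = [40.0, 55.0, 80.0, 100.0]
predicted = [44.0, 50.0, 110.0, 95.0]

print("N25 (%):", n25(measured, predicted))
print("R2:", r_squared(measured, predicted))
```

In this made-up example three of the four predictions fall within the 25% band, so a single large miss lowers N25 without necessarily lowering R² by much, which is one reason to read the two columns together.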


P. Kulkarni, S. Chellam / Science of the Total Environment 408 (2010) 4202–4210

Fig. 6. Summary of relative error distributions for all ANN simulations.

peaked within the range of Br−/DOC ratios encountered. CHCl2Br, which contains one mole of Br per mole of THM, peaked at 5 μM/mM Br−/DOC. CHClBr2, which contains two moles of Br per mole of THM, peaked at twice that Br−/DOC, at approximately 10 μM/mM. The bromine incorporation factor and bromide utilization in THMs were also quantified to study the degree of bromine substitution (Chellam and Krasner, 2001; Gould et al., 1983; Symons et al., 1993):

$$\text{Bromine incorporation factor} = \frac{\sum_{k=0}^{3} k\,[\mathrm{CHCl}_{3-k}\mathrm{Br}_{k}]}{\sum_{k=0}^{3} [\mathrm{CHCl}_{3-k}\mathrm{Br}_{k}]} \tag{5}$$

$$\text{Bromide utilization} = \frac{\sum_{i=1}^{3} i \times [\mathrm{CHCl}_{3-i}\mathrm{Br}_{i}]}{[\mathrm{Br}^{-}]} \tag{6}$$
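Eqs. (5) and (6) can be evaluated directly from the four THM concentrations. The following is a minimal sketch, assuming that all concentrations are molar (μM here) and that the bromide term in Eq. (6) is the bromide concentration of the chlorinated water. The function names and THM values are hypothetical. The bromide value reuses the 115 μg/L feed concentration from the Fig. 7 caption purely as an illustrative number, converted with the atomic mass of bromine.

```python
BR_ATOMIC_MASS = 79.904  # g/mol


def bromine_incorporation_factor(thm_molar):
    """Eq. (5): thm_molar[k] is the molar concentration of CHCl(3-k)Br(k), k = 0..3."""
    total = sum(thm_molar)
    return sum(k * c for k, c in enumerate(thm_molar)) / total


def bromide_utilization(thm_molar, bromide_molar):
    """Eq. (6): moles of Br incorporated into THMs per mole of bromide ion."""
    br_in_thms = sum(i * c for i, c in enumerate(thm_molar) if i >= 1)
    return br_in_thms / bromide_molar


# Hypothetical THM concentrations in uM: [CHCl3, CHCl2Br, CHClBr2, CHBr3]
thm = [0.20, 0.15, 0.10, 0.05]

# 115 ug/L bromide converted to uM (ug/L divided by g/mol gives umol/L)
bromide_um = 115.0 / BR_ATOMIC_MASS

print("Bromine incorporation factor:", bromine_incorporation_factor(thm))
print("Bromide utilization:", bromide_utilization(thm, bromide_um))
```

The incorporation factor is bounded between 0 (only CHCl3) and 3 (only CHBr3), so it summarizes the speciation shift in a single number, whereas bromide utilization also depends on how much bromide was available to react.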

Bromide substitution parameters for the same GAC run (corresponding to Fig. 7a and b) are shown in Fig. 7c. Increases in Br−/DOC increased total Br incorporation while simultaneously decreasing Cl incorporation into THMs. Br and Cl incorporation were found to be equal at a Br−/DOC of 8.2 μM/mM, corresponding to a Br−/Cl2 molar ratio of 0.044. Equal incorporation at a bromide level of only about 4% of the chlorine on a molar basis confirms that HOBr is more reactive than HOCl in forming THMs. The decreasing trend in bromide utilization with Br−/DOC in Fig. 7c can be attributed to reductions in DOC concentrations at a fixed Br− concentration. Low NOM concentrations in the GAC effluent mean that only a few sites are available for bromine substitution. Since HOBr is a more powerful halogenating agent than HOCl, the brominated DBPs are formed first, with bromine consuming the available sites on NOM. In precursor-limited waters, bromide utilization is reduced because excess Br− cannot react further once all available NOM reactive sites are occupied. In other words, at the start of a GAC run (low DOC and high Br−/DOC), only a small fraction of the total Br− is substituted into NOM due to the paucity of reactive sites and the majority of Br− cannot react, resulting in low bromide utilization. As DOC breaks through over the course of a GAC run (increasing DOC and decreasing Br−/DOC), the number of sites available for substitution concomitantly increases, allowing for greater bromide utilization.

NF was extremely effective in DBP precursor control but also induced significant shifts towards the brominated THM species. THM mole fractions in the NF feed water were in the order CHCl3 > CHCl2Br > CHClBr2 > CHBr3 (Fig. 8). Very high NOM removal combined with poor bromide ion removal by NF resulted in a large increase in the Br−/

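Table 3. Relative importance (%) of water quality parameters and chlorination conditions on DBP formation. THM4: trihalomethanes; HAA6: haloacetic acids; TOX: total organic halide.

| Parameter | THM4 Raw | THM4 Conv. | THM4 GAC | THM4 NF | HAA6 Raw | HAA6 Conv. | HAA6 GAC | HAA6 NF | TOX Raw | TOX Conv. | TOX GAC | TOX NF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DOC | 40 | 24 | 18 | 21 | 39 | 22 | 16 | 18 | 42 | 26 | 24 | 22 |
| UV254 | 8 | 12 | 10 | 17 | 11 | 11 | 13 | 18 | 17 | 21 | 11 | 21 |
| Bromide | 20 | 22 | 26 | 32 | 9 | 14 | 20 | 29 | 3 | 8 | 14 | 21 |
| Cl2 dose | 12 | 18 | 30 | 13 | 17 | 30 | 35 | 21 | 29 | 21 | 36 | 18 |
| Temperature | 4 | 8 | 8 | 5 | 7 | 6 | 5 | 3 | 4 | 9 | 3 | 6 |
| pH | 8 | 10 | 4 | 8 | 9 | 10 | 6 | 8 | 1 | 11 | 9 | 9 |
| Contact time | 8 | 6 | 4 | 4 | 8 | 7 | 5 | 3 | 4 | 4 | 3 | 3 |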
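The percentages in Table 3 are derived from ANN connection weights, but the partitioning procedure is not restated in this section. As a hedged illustration only, the sketch below assumes a network with one hidden layer and a single output, and applies one common Garson-style weight partitioning (Garson, 1991, appears in the reference list). It should not be read as the authors' exact calculation. The function name, the three input labels, and all weights are hypothetical.

```python
def relative_importance(w_in_hidden, w_hidden_out):
    """Garson-style relative importance (%) of each input.

    w_in_hidden[i][j]: weight from input i to hidden neuron j.
    w_hidden_out[j]:   weight from hidden neuron j to the single output.
    """
    n_inputs = len(w_in_hidden)
    n_hidden = len(w_hidden_out)
    scores = [0.0] * n_inputs
    for j in range(n_hidden):
        col_sum = sum(abs(w_in_hidden[i][j]) for i in range(n_inputs))
        if col_sum == 0.0:
            continue  # hidden neuron receives no signal from any input
        for i in range(n_inputs):
            # Share of hidden neuron j attributable to input i,
            # scaled by how strongly neuron j drives the output
            share = abs(w_in_hidden[i][j]) / col_sum
            scores[i] += share * abs(w_hidden_out[j])
    total = sum(scores)
    return [100.0 * s / total for s in scores]


# Hypothetical weights: 3 inputs (labelled DOC, Br-, Cl2 dose) and 2 hidden neurons
labels = ["DOC", "Br-", "Cl2 dose"]
w_ih = [[1.2, -0.4],
        [0.3, 0.9],
        [-0.5, 0.7]]
w_ho = [0.8, -1.1]

for name, pct in zip(labels, relative_importance(w_ih, w_ho)):
    print(f"{name}: {pct:.1f}%")
```

Because absolute values are used, the result indicates how strongly each input influences the prediction and not the direction of that influence. By construction the percentages sum to 100, as each column of Table 3 does.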


Fig. 8. General increase of brominated THMs in NF permeate compared with the feed water.

DOC in the permeate water (60 μM/mM) compared to the feed water (1.8 μM/mM). This shifted THM formation towards the brominated species in the permeate water, where concentrations were in the order CHClBr2 > CHCl2Br > CHCl3 > CHBr3. Even though NF significantly removed total THM precursors, concentrations of the highly brominated species (CHBr3) actually increased in the permeate compared with the feed water. Similar observations have been made in other membrane–source water combinations as well (Chellam and Krasner, 2001).

Changes in THM speciation summarized in Figs. 7 and 8 would also have been influenced by changing Br−/Cl2 (Symons et al., 1993). A constant Cl2/TOC ratio could not be used in this study since a higher chlorine dose was required for waters with a higher TOC concentration in order to achieve a target chlorine residual of 0.75 mg/L at the end of SDS testing. Under these experimental conditions, Br−/Cl2 increased in a logarithmic manner as 9.2 ln(Br−/DOC) + 23.9, in which both ratios are expressed in μM/mM. Thus, the HOBr/HOCl ratio also increased with Br−/DOC, preferentially shifting DBP speciation towards the more brominated species.

It should be noted that HAA speciation was difficult to interpret quantitatively from the ICR treatment studies since only six of the nine HAA species containing chlorine and bromine were typically analyzed and several HAA species were often below minimum reporting levels. The interested reader can refer to earlier publications that provide a detailed interpretation of changes in HAA speciation with treatment (e.g., Chellam and Krasner, 2001; Liang and Singer, 2003).

Fig. 7. a. Breakthrough of NOM (measured as TOC) and precursors to TOX and TTHMs. Feed water TOC = 4.5 mg/L, SDSTOX = 223.7 mg Cl−/L, SDSTTHM = 85.1 μg/L, and Br− = 115 μg/L. b. Shift towards more brominated THM species with increasing Br−/DOC ratio in the GAC effluent. SDS conditions: 6-hour hold time, pH 9, Cl2 residual 0.75 mg/L. c. Effects of Br−/DOC ratio on halogen incorporation and bromide utilization in THMs following GAC treatment.

4. Implications and conclusions

Robust ANNs requiring low quantities of data for training satisfactorily predicted the formation of total trihalomethanes, the sum of six haloacetic acids, and total organic halide, as well as individual THM and HAA species, in chlorinated waters covering a geographically diverse area of the United States. Benchmarking predictions against an extensive set of experimental measurements demonstrated that ANNs can closely predict DBP concentrations (under SDS conditions) following conventional and advanced treatment. Hence, complex and nonlinear relationships between water quality parameters and chlorination conditions influencing DBP formation and speciation were successfully captured by ANNs, suggesting that they are viable alternatives to bench-scale laboratory testing to simulate large-scale unit processes. In other words, ANNs could be successfully used for process optimization and control and even for evaluating changes in DBP formation when operating conditions are changed or when advanced technologies are implemented for NOM removal. Hence, ANNs are valuable tools to compare and select NOM removal alternatives and can also reduce the experimental burden associated with relatively expensive and time-consuming pilot-scale tests.

It should be emphasized that since DBP control in ICR treatment studies was evaluated through SDS testing, the ANN models presented herein are not strictly applicable to predicting DBP concentrations in existing full-scale distribution systems. Even though ANNs can provide water purveyors with quantitative estimates of DBP concentrations, they do not provide a comprehensive mechanistic understanding of the chemical reactions and kinetics involved in DBP formation (e.g., Hua and Reckhow, 2008; Liang and Singer, 2003; Obolensky and Singer, 2005). Importantly, as with all empirical models, care should be taken not to apply ANNs beyond the range of water quality parameters for which they were derived. Also, the existence of outliers in Figs. 2–5 demonstrates the inherent difficulties in accurately predicting individual DBP concentrations, especially when a single ANN is applied to a very wide range of complex water chemistries, geographically diverse locations, and treatment parameters. In any case, all inputs for the ANNs derived herein are simple-to-measure water quality parameters that are routinely monitored in drinking water facilities, thereby facilitating their implementation to screen preliminary process alternatives for DBP control.

Furthermore, ANNs have also been reported to closely predict the kinetics of Giardia inactivation (Haas, 2004). Hence, ANNs appear to be able to respond to water quality variations and closely capture the complex aqueous phase behavior of both protozoa and chemical contaminants. In contrast, mechanistic models are not yet available to predict either microorganism inactivation or DBP formation under conditions of drinking water treatment. Hence, ANNs appear to be a potentially useful tool to quantify the seemingly conflicting requirements of microbial and DBP regulations and subsequently make better decisions related to the design and operation of drinking water facilities to simultaneously meet existing primary drinking water standards.

Acknowledgments

This research has been funded by a grant from the National Science Foundation CAREER program (BES-0134301). The contents do not necessarily reflect the views and policies of the sponsors nor does the mention of trade names or commercial products constitute endorsement or recommendation for use.

References

Allgeier SC, Shukairy HM, Westrick JJ. ICR treatment studies. J Am Wat Wks Assn 1998;90:70–82.
Allgeier SC, Summers RS. Evaluating NF for DBP control with the RBSMT. J Am Wat Wks Assn 1995;87:87–99.
Bond R, DiGiano FA. Evaluating GAC performance using the ICR database. J Am Wat Wks Assn 2004;96:96–104.
Bowen WR, Jones MG, Yousef HNS. Prediction of the rate of crossflow membrane ultrafiltration of colloids: a neural network approach. Chem Eng Sci 1998;53:3793–802.
Chang EE, Lin YP, Chiang PC. Effects of bromide on the formation of THMs and HAAs. Chemosphere 2001;43:1029–34.
Chellam S, Krasner SW. Disinfection by-product relationships and speciation in chlorinated nanofiltered waters. Environ Sci Technol 2001;35:3988–99.
Chellam S, Taylor JS. Simplified analysis of contaminant rejection during ground- and surface water nanofiltration under the information collection rule. Wat Res 2001;35:2460–74.
Chen C, Zhang X, He W, Lu W, Han H. Comparison of seven kinds of drinking water treatment processes to enhance organic material removal: a pilot test. Sci Total Environ 2007;382:93–102.
Chowdhury S, Champagne P, McLellan PJ. Models for predicting disinfection byproduct (DBP) formation in drinking waters: a chronological review. Sci Total Environ 2009;407:4189–206.
EPA. DBP/ICR analytical methods manual (EPA 814-B-96-002); 1996. Cincinnati, OH.
EPA. ICR treatment study database, version 1.0 (EPA 815-C-00-003); 2000. Cincinnati, OH.
Garson GD. Interpreting neural network connection weights. AI Expert 1991;6:47–51.
Goh ATC. Seismic liquefaction potential assessed by neural networks. J Geotech Eng 1994;120:1467–80.
Gould JP, Fitchhorn LE, Urheim E. Formation of brominated trihalomethanes: extent and kinetics. In: Jolley RL, Brungs W, Cotruvo J, Mattice J, Jacobs V, editors. Water chlorination: environmental impact and health effects. 4. Ann Arbor, MI: Ann Arbor Science Publishers; 1983. p. 297–310.
Haas CN. Neural networks provide superior description of Giardia lamblia inactivation by free chlorine. Wat Res 2004;38:3449–57.
Hamidin N, Yu QJ, Connell DW. Human health risk assessment of chlorinated disinfection by-products in drinking water using a probabilistic approach. Wat Res 2008;42:3263–74.
Hong HC, Liang Y, Han BP, Mazumder A, Wong MH. Modeling of trihalomethane (THM) formation via chlorination of the water from Dongjiang River (source water for Hong Kong's drinking water). Sci Total Environ 2007;385:48–54.
Hua G, Reckhow DA. DBP formation during chlorination and chloramination: effect of reaction time, pH, dosage, and temperature. J Am Wat Wks Assn 2008;100:82–95.
Koch B, Krasner SW, Sclimenti MJ, Schimpff WK. Predicting the formation of DBPs by the simulated distribution system. J Am Wat Wks Assn 1991;83:62–70.
Legube B, Parinet B, Gelinet K, Berne F, Croue J-P. Modeling of bromate formation by ozonation of surface waters in drinking water treatment. Wat Res 2004;38:2185–95.
Liang L, Singer PC. Factors influencing the formation and relative distribution of haloacetic acids and trihalomethanes in drinking water. Environ Sci Technol 2003;37:2920–8.
Massart DL, Vandeginste BGM, Buydens LMC, De Jong S, Lewi PJ, Smeyers-Verbeke J. Handbook of chemometrics and qualimetrics: part A. Amsterdam, The Netherlands: Elsevier Science; 1997.
McGuire MJ, McLain JL, Obolensky A. Information Collection Rule data analysis. Denver, CO: Awwa Research Foundation; 2002.
Milot J, Rodriguez MJ, Serodes JB. Contribution of neural networks for modeling trihalomethanes occurrence in drinking water. J Water Resour Plann Manage 2002;128:370–6.
Myllykangas T, Nissinen TK, Mäki-Paakkanen J, Hirvonen A, Vartiainen T. Bromide affecting drinking water mutagenicity. Chemosphere 2003;53:745–56.
Obolensky A, Singer PC. Halogen substitution patterns among disinfection byproducts in the information collection rule database. Environ Sci Technol 2005;39:2719–30.
Obolensky A, Singer PC. Development and interpretation of disinfection byproduct formation models using the Information Collection Rule database. Environ Sci Technol 2008;42(15):5654–60.
Platikanov S, Puig X, Martín J, Tauler R. Chemometric modeling and prediction of trihalomethane formation in Barcelona's water works plant. Wat Res 2007;41(15):3394–406.
Rodriguez MJ, Milot J, Sérodes J-B. Predicting trihalomethane formation in chlorinated waters using multivariate regression and neural networks. J Wat Supply Res Tech AQUA 2003;52:199–215.
Sadiq R, Rodriguez MJ. Disinfection by-products (DBPs) in drinking water and predictive models for their occurrence: a review. Sci Total Environ 2004a;321:21–46.
Sadiq R, Rodriguez MJ. Fuzzy synthetic evaluation of disinfection by-products: a risk-based indexing system. J Environ Manage 2004b;73:1–13.
Shimazu H, Kouchi M, Sugita Y, Yonekura Y, Kumano H, Hashiwata K, Hirota T, Ozaki N, Fukushima T. Developing a model for disinfection by-products based on multiple regression analysis in a water distribution system. J Wat Supply Res Tech AQUA 2005;54(4):225–37.
Singer PC. Formation and control of disinfection by-products in drinking water. Denver, CO: American Water Works Association; 1999.
Sohn J, Amy G, Cho J, Lee Y, Yoon Y. Disinfectant decay and disinfection by-products formation model development: chlorination and ozonation by-products. Wat Res 2004;38:2461–78.
Symons JM, Krasner SW, Sclimenti MJ, Simms LA, Sorensen HW, Speitel GE, et al. Influence of bromide ion on trihalomethane and haloacetic acid formation. In: Minear R, Amy GL, editors. Disinfection by-products in water treatment: the chemistry of their formation and control. Boca Raton, FL: Lewis Publishers; 1996. p. 91–130.
Symons JM, Krasner SW, Simms LA, Sclimenti M. Measurement of THM and precursor concentrations revisited: the effect of bromide ion. J Am Wat Wks Assn 1993;85:51–62.
Uyak V, Ozdemir K, Toroz I. Multiple linear regression modeling of disinfection byproducts formation in Istanbul drinking water reservoirs. Sci Total Environ 2007;378:269–80.
Westerhoff P, Debroux J, Amy GL, Gatel D, Mary V, Cavard J. Applying DBP models to full-scale plants. J Am Wat Wks Assn 2000;92:89–102.