Science of the Total Environment 408 (2010) 4202–4210
Disinfection by-product formation following chlorination of drinking water: Artificial neural network models and changes in speciation with treatment

Pranav Kulkarni a,1, Shankararaman Chellam a,b,⁎,2

a Department of Civil and Environmental Engineering, University of Houston, Houston, TX 77204-4003, United States
b Department of Chemical and Biomolecular Engineering, University of Houston, Houston, TX 77204-4004, United States
Article info

Article history: Received 17 February 2010; Received in revised form 21 May 2010; Accepted 29 May 2010; Available online 26 June 2010

Keywords: Artificial neural networks; Disinfection by-products; Drinking water; Trihalomethanes; Haloacetic acids; Total organic halide; Chlorine disinfection; Nanofiltration; Granular activated carbon
Abstract

Artificial neural network (ANN) models were developed to predict disinfection by-product (DBP) formation during municipal drinking water treatment using the Information Collection Rule Treatment Studies database compiled by the United States Environmental Protection Agency. The formation of trihalomethanes (THMs), haloacetic acids (HAAs), and total organic halide (TOX) upon chlorination of untreated water, and after conventional treatment, granular activated carbon treatment, and nanofiltration was quantified using ANNs. Highly accurate predictions of DBP concentrations were possible using physically meaningful water quality parameters as ANN inputs including dissolved organic carbon (DOC) concentration, ultraviolet absorbance at 254 nm and one cm path length (UV254), bromide ion concentration (Br−), chlorine dose, chlorination pH, contact time, and reaction temperature. This highlights the ability of ANNs to closely capture the highly complex and non-linear relationships underlying DBP formation. Accurate simulations suggest the potential use of ANNs for process control and optimization, comparison of treatment alternatives for DBP control prior to piloting, and even to reduce the number of experiments to evaluate water quality variations when operating conditions are changed. Changes in THM and HAA speciation and bromine substitution patterns following treatment are also discussed. © 2010 Elsevier B.V. All rights reserved.
1. Introduction

In addition to inactivating microorganisms, chemical disinfectants such as chlorine also react with natural organic matter (NOM) and bromide ion in water to form numerous disinfection by-products (DBPs), which have been implicated as human mutagens, carcinogens, and teratogens (Hamidin et al., 2008). Trihalomethanes (THMs) and haloacetic acids (HAAs) constitute the major halogenated DBPs currently regulated in drinking water, accounting for approximately half the total organic halide (TOX) concentration. Since THMs and HAAs are not typically present in the source water but are by-products formed during chlorination as an unintended consequence, they are most often controlled by reducing the concentrations of their precursors (particularly NOM) prior to adding chlorine. In this manuscript, we consider three important water treatment processes employed for NOM (and DBP precursor) removal from
⁎ Corresponding author. Department of Civil and Environmental Engineering, University of Houston, Houston, TX 77204-4003, United States. Tel.: +1 713 743 4265; fax: +1 713 743 4260. E-mail address:
[email protected] (S. Chellam). 1 Present address: Trinity Consultants, Houston, TX, United States. 2 Originally submitted to Science of the Total Environment on February 17, 2010. Revised version submitted on May 23, 2010. 0048-9697/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.scitotenv.2010.05.040
drinking water sources: (i) conventional treatment (coagulation–flocculation–sedimentation–media filtration), (ii) granular activated carbon (GAC) adsorption, and (iii) nanofiltration (NF) (Chen et al., 2007). We are particularly interested in the formation of THMs, HAAs, and TOX following these processes using free chlorine as the disinfectant. It should be emphasized that even though current regulations do not include TOX, good treatment practices necessitate its control as well. It is difficult to derive purely mechanistic models of DBP formation in natural waters due to the inherent heterogeneity of NOM, the complex background chemistry of municipal water supplies, and large variations in water quality of surface water supplies with season and location in terms of NOM concentrations, origin, and characteristics. Additionally, since the removal of specific NOM components depends on the treatment processes employed (e.g. coagulation and GAC preferentially remove hydrophobic portions and NF preferentially removes higher molecular weight portions), the DBP yield is changed upon treatment, further complicating the chemistry and prediction of DBP formation. Hence, DBP mass concentrations [DBP] are typically modeled empirically by linearly regressing each of the water quality parameters influencing DBP formation including dissolved organic carbon concentration (DOC), ultraviolet absorbance at 254 nm and one cm path length (UV254), bromide ion concentration (Br−), chlorine dose (Cl2), chlorination pH (pH), contact time (Time), and
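reaction temperature (Temp) specific to each water supply and treatment process:

$$ [\mathrm{DBP}] = k \times \mathrm{DOC}^{a} \times \mathrm{UV254}^{b} \times (\mathrm{Br}^{-})^{c} \times \mathrm{Cl_2}^{d} \times \mathrm{pH}^{e} \times \mathrm{Time}^{f} \times \mathrm{Temp}^{g} \tag{1} $$
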
where k, a, b, c, d, e, f, and g are empirical constants. Log-linear power functions similar to Eq. (1) are extensively employed to model THM and HAA formation (e.g. Chowdhury et al., 2009; Hong et al., 2007; Sadiq and Rodriguez, 2004a; Sohn et al., 2004; Uyak et al., 2007; Westerhoff et al., 2000) even though they are best suited only to predict central tendencies of the databases used to develop them. In contrast, artificial neural networks (ANNs) have the capability to approximate any function (and its derivatives) to any degree of accuracy. Also, the superior ability of ANNs to handle noisy, distorted multivariate data makes them a more powerful modeling tool compared with regression models such as Eq. (1). Paradoxically, even though ANNs have the potential to better predict DBP formation compared to multivariate regression models, only a very limited number of studies have considered ANNs for DBP formation upon chlorination. Rodriguez and co-workers (Milot et al., 2002; Rodriguez et al., 2003) focused exclusively on THMs (HAAs and TOX were excluded in these studies) and also evaluated DBPs' health risks using fuzzy logic models (Sadiq and Rodriguez, 2004b). Additional research is necessary to specifically evaluate the capability of ANNs to predict DBP formation following the many water treatment processes implemented for DBP control. The principal objective of this research is to derive accurate ANN models for THM, HAA, and TOX formation following chlorination of raw and treated (conventional treatment, GAC, and NF) waters. We also provide additional experimental data on changes in THM and HAA speciation focusing on variations in bromine substitution with treatment. ANNs were implemented and validated using a large dataset that includes an extensive set of bench-scale, pilot-scale, and full-scale experiments from numerous water treatment plants located in the United States (Allgeier et al., 1998).
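For readers who want to see how a power function such as Eq. (1) is typically fitted, the following is a minimal sketch, assuming ordinary least squares on log-transformed data and NumPy. The function names and the synthetic, noise-free data are illustrative assumptions and are not taken from the studies cited above.

```python
import numpy as np

def fit_power_law(X, y):
    """Fit y = k * prod(X[:, j] ** p[j]) by least squares on logarithms.

    X: (n_obs, n_inputs) array of strictly positive inputs
    y: (n_obs,) array of strictly positive DBP concentrations
    Returns (k, p), where p holds the exponents a..g of Eq. (1).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # log y = log k + sum_j p_j * log x_j is linear in (log k, p).
    A = np.column_stack([np.ones(len(y)), np.log(X)])
    coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    return float(np.exp(coef[0])), coef[1:]

def predict_power_law(k, p, X):
    """Evaluate the fitted power function for each row of X."""
    return k * np.prod(np.asarray(X, dtype=float) ** p, axis=1)

# Synthetic illustration only (made-up values, not ICR data).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.5, 5.0, size=(50, 7))      # seven positive inputs
p_true = np.array([1.0, 0.3, 0.1, 0.4, 0.8, 0.25, 0.5])
y_demo = 2.0 * np.prod(X_demo ** p_true, axis=1)  # k = 2.0, no noise
k_fit, p_fit = fit_power_law(X_demo, y_demo)
```

Because the fit is performed in log space, it minimizes relative rather than absolute deviations, which is one reason such models tend to describe central tendencies better than extremes.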
Network connection weights are also interpreted quantitatively to develop more insights into the relative importance of individual physicochemical factors known to influence DBP formation and speciation.

2. Neural networks

A commercially available software program (Matlab neural network toolbox 6.1, The MathWorks, Inc., Natick, MA) was used to implement ANNs on a personal computer. In this study, feed forward, back propagation ANNs consisting of an input layer, one or two hidden layers, and an output layer were developed. To minimize network complexity and improve its performance, the least number of physically meaningful input parameters was employed; viz. DOC concentration, UV254, Br− concentration and chlorination conditions including Cl2 dose, contact time, pH, and temperature, which are the same parameters in Eq. (1). The output consisted of a single neuron representing TOX, THM or HAA concentrations. As commonly practiced, the number of hidden layers and the number of hidden neurons were determined iteratively using trial and error. In this study, one or two hidden layers consisting of 4–8 neurons were found to be satisfactory for all simulations. To improve the efficiency of batch training, the Levenberg–Marquardt algorithm with optimum learning rate between 0.01 and 0.0001 was chosen through experimentation to avoid instability and excessive convergence time. All synaptic weights were initialized randomly in the range (−0.5, +0.5) and accordingly readjusted (via back propagation) to reduce the difference between actual and desired output in terms of sum of squared error (SSE):

$$ \mathrm{SSE} = \sum_{i=1}^{n_{\mathrm{train}}} \left( [\mathrm{DBP}]_{\mathrm{obs},i} - [\mathrm{DBP}]_{\mathrm{pred},i} \right)^{2} \tag{2} $$

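The network structure and the SSE objective of Eq. (2) can be illustrated with a short sketch. It is not the Matlab toolbox implementation used in this study: it assumes NumPy, a single hidden layer with tanh activation (the activation function is not stated in the text), a linear output neuron, and plain batch gradient descent in place of the Levenberg–Marquardt algorithm. All names and the synthetic data are hypothetical.

```python
import numpy as np

def init_network(n_inputs, n_hidden, rng):
    """Draw all weights and biases uniformly from (-0.5, 0.5)."""
    u = lambda *shape: rng.uniform(-0.5, 0.5, size=shape)
    return {"W1": u(n_inputs, n_hidden), "b1": u(n_hidden),
            "W2": u(n_hidden), "b2": float(rng.uniform(-0.5, 0.5))}

def forward(params, X):
    """Return the network output and the hidden-layer activations."""
    H = np.tanh(X @ params["W1"] + params["b1"])   # assumed tanh hidden layer
    return H @ params["W2"] + params["b2"], H      # single linear output neuron

def sse(params, X, y):
    """Sum of squared errors, as in Eq. (2)."""
    pred, _ = forward(params, X)
    return float(np.sum((y - pred) ** 2))

def gradient_step(params, X, y, lr):
    """One batch back-propagation update (gradient descent, not LM)."""
    pred, H = forward(params, X)
    err = pred - y
    dH = np.outer(err, params["W2"]) * (1.0 - H ** 2)   # error at hidden layer
    grads = {"W2": 2 * H.T @ err, "b2": 2 * err.sum(),
             "W1": 2 * X.T @ dH, "b1": 2 * dH.sum(axis=0)}
    return {name: params[name] - lr * grads[name] for name in params}

# Synthetic illustration: 7 inputs scaled to 0-1 and a made-up target.
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(30, 7))
y_demo = 0.5 * X_demo[:, 0] + 0.3 * X_demo[:, 3] * X_demo[:, 1]
params = init_network(n_inputs=7, n_hidden=4, rng=rng)
sse_before = sse(params, X_demo, y_demo)
for _ in range(500):
    params = gradient_step(params, X_demo, y_demo, lr=0.001)
sse_after = sse(params, X_demo, y_demo)
```

Gradient descent is used here only because it fits in a few lines; a Levenberg–Marquardt optimizer would normally converge in far fewer iterations on a network this small.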
In Eq. (2), [DBP]obs is the experimental or observed DBP concentration, [DBP]pred is the corresponding ANN output or prediction, and ntrain is the number of observations employed for ANN training. For each simulation, the network was trained iteratively until SSE < 10⁻⁵ or the maximum gradient was reached. As is usually necessary, data were normalized by the corresponding maximum value to put them in the range 0–1. Training data were carefully chosen so that they had a greater descriptive ability while simultaneously making an effort to use a minimum number of observations. Because ANNs are better suited to interpolate rather than extrapolate, the maximum and minimum values of each input parameter were always chosen to train the network, leaving intermediate measurements for validation. The optimum network architecture for each type of treatment and untreated water was obtained by multiple runs. Network validation was performed by providing only those input values that were not included in the original training set. The quality of DBP predictions in comparison to the desired outputs in the validation dataset was evaluated both in terms of the regression coefficient and the N25 value, which represents the percentage of predictions that have less than 25% absolute relative error calculated as:

$$ \text{Absolute Relative Error} = \frac{\left| [\mathrm{DBP}]_{\mathrm{pred}} - [\mathrm{DBP}]_{\mathrm{obs}} \right|}{[\mathrm{DBP}]_{\mathrm{obs}}} \tag{3} $$

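The max-normalization and the N25 statistic built on Eq. (3) are simple enough to state in code. The sketch below is illustrative only; the function names and the four observation/prediction pairs are made up for the example.

```python
def normalize_by_max(values):
    """Scale non-negative values into the range 0-1 by dividing by their maximum."""
    peak = max(values)
    return [v / peak for v in values]

def absolute_relative_error(observed, predicted):
    """Eq. (3) evaluated for each observation/prediction pair."""
    return [abs(p - o) / o for o, p in zip(observed, predicted)]

def n_within(observed, predicted, threshold=0.25):
    """Percentage of predictions with absolute relative error below threshold.

    threshold=0.25 gives N25; threshold=0.10 gives N10.
    """
    errors = absolute_relative_error(observed, predicted)
    return 100.0 * sum(e < threshold for e in errors) / len(errors)

# Made-up concentrations for illustration.
obs = [40.0, 80.0, 100.0, 20.0]
pred = [42.0, 70.0, 130.0, 21.0]
n25 = n_within(obs, pred)          # errors: 0.05, 0.125, 0.30, 0.05
n10 = n_within(obs, pred, 0.10)
```

In this example three of the four predictions fall within 25% of the observation and two fall within 10%.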
The contributions of each input variable to predict DBP concentrations were determined by the Garson weight partitioning method using the absolute values of the neuronal connection weights (Garson, 1991; Goh, 1994):

$$ \text{Relative importance of input variable } v = \frac{\displaystyle\sum_{j=1}^{n_H} \left[ \frac{i_{vj}}{\sum_{k=1}^{n_v} i_{kj}}\, O_j \right]}{\displaystyle\sum_{i=1}^{n_v} \left[ \sum_{j=1}^{n_H} \left( \frac{i_{ij}}{\sum_{k=1}^{n_v} i_{kj}}\, O_j \right) \right]} \tag{4} $$

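A minimal sketch of the weight-partitioning calculation in Eq. (4) follows, written for a network with one hidden layer and a single output neuron. The function name and the example weights are hypothetical; they are not weights from the networks trained in this study.

```python
def garson_importance(w_input_hidden, w_hidden_output):
    """Relative importance (%) of each input by Garson weight partitioning.

    w_input_hidden[v][j]: weight from input v to hidden neuron j
    w_hidden_output[j]:   weight from hidden neuron j to the output neuron
    Only absolute values of the weights are used.
    """
    n_inputs = len(w_input_hidden)
    n_hidden = len(w_hidden_output)
    contributions = [0.0] * n_inputs
    for j in range(n_hidden):
        # Total absolute weight feeding hidden neuron j.
        column_total = sum(abs(w_input_hidden[k][j]) for k in range(n_inputs))
        for v in range(n_inputs):
            share = abs(w_input_hidden[v][j]) / column_total
            contributions[v] += share * abs(w_hidden_output[j])
    total = sum(contributions)
    return [100.0 * c / total for c in contributions]

# Hypothetical weights: 3 inputs, 2 hidden neurons.
w_ih = [[0.2, -0.6],
        [-0.2, 0.3],
        [0.4, 0.1]]
w_ho = [0.5, -1.0]
importance = garson_importance(w_ih, w_ho)
```

Because absolute values are used, the result indicates how strongly an input influences the output but not the direction of that influence.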
In Eq. (4), nv and nH are the numbers of input and hidden neurons, respectively, and ivj and Oj denote the absolute values of the connection weights from input v to hidden neuron j and from hidden neuron j to the output, respectively. Eq. (4) is best suited to interpret the trends in relative importance of input variables rather than the calculated absolute values.

2.1. Experimental dataset

All data used in this manuscript are available in the ICR Treatment Study Database developed by the United States Environmental Protection Agency using bench-, pilot-, and full-scale DBP precursor removal data provided by public water systems meeting certain criteria (Allgeier et al., 1998). GAC and NF studies were performed encompassing seasonal variations in ground- and surface waters representing numerous full-scale water treatment plants across the United States (Allgeier and Summers, 1995; Bond and DiGiano, 2004). It should be emphasized that under the ICR Treatment Study requirements, DBP precursor removal was evaluated using simulated distribution system (SDS) testing (Koch et al., 1991). Limited available data suggest a greater complexity in predicting DBPs from actual real-world distribution systems compared with studies employing SDS testing (e.g. Platikanov et al., 2007; Shimazu et al., 2005). This is because distribution systems have several nonidealities (e.g. dead-zones, potential presence of biofilms, different chlorine decay kinetics caused by pipe wall roughness, a distribution of residence times, non-uniform disinfectant concentrations, etc.) that cannot be accurately simulated in a simple SDS test run in batch mode.
In any case, all DBP data used in this research were generated through SDS testing under the ICR Treatment Study requirement (rather than actual measurements from various locations within an actual distribution system). Qualifying municipalities evaluated GAC and NF to provide technical and economic data to assess the feasibility of these technologies for precursor control in anticipation of more stringent DBP regulations. SDS testing allowed municipalities to evaluate various design parameters for these advanced technologies under site specific conditions without sending the treated water to their customers through the distribution system. In other words, the purpose of ICR Treatment Studies was not to measure DBP concentrations in full-scale distribution systems following existing treatment techniques. Rather, its purpose was to determine the extent of DBP control achieved by GAC and NF for a geographically diverse set of water supplies so that they can be potentially installed for large-scale water treatment in the future. A separate 18-month DBP monitoring effort was also required under the ICR, wherein water samples were collected at numerous locations within full-scale water treatment plants including the distribution system (McGuire et al., 2002; Obolensky and Singer, 2005). Multiple linear regression models incorporating important water chemistry parameters have been recently developed for DBP formation using these data (Obolensky and Singer, 2008). A potential research opportunity is to use this monitoring (as opposed to the Treatment Study) database to compare the relative accuracy of regression and ANNs to predict DBP concentrations in real-world distribution systems. Detailed information on pretreatment, unit processes, water quality, etc. is available in the ICR database (EPA, 2000).
In addition to the extensive data obtained, the ICR imposed stringent quality control requirements for the conduct of treatment studies as well as analytical methods for water quality and DBP measurements, making it a very high quality database ideally suited to develop DBP formation models. A summary of the database with the range of each of the seven ANN input parameters (DOC, UV254, Br−, chlorination conditions including Cl2 dose, contact time, pH, and temperature) is given in Table 1. The Environmental Protection Agency established Minimum Reporting Levels (MRLs) separate from method detection limits for each water quality parameter corresponding to laboratories' ability to measure analytes with a predetermined accuracy and precision. The MRLs specified under the ICR have been reported elsewhere (Chellam and Taylor, 2001; EPA, 1996). If any of the seven ANN input values was below the minimum reporting level (BMRL), that entire set of readings was excluded, whereas output values (DBP concentrations) with BMRL values were assigned a value of half the MRL. Datasets wherein ammonia was detected were excluded in this research because only DBP formation with free chlorine is considered in this manuscript. Based on the water treatment unit processes used and available data, the entire database was divided into four subsets, viz. untreated water (5 datasets), conventional treatment (30 datasets), water treated by GAC (57 datasets), and NF (26 datasets). Each bench-scale study includes four seasonal THM, HAA, and TOX measurements at different operating conditions along with related operational and water quality parameters such as sampling time, operation time, pH, turbidity, DOC, UV254, Br−, and simulated distribution system conditions using free chlorine. Bench-scale experiments were performed either with a flat sheet of a commercially available nanofiltration membrane (Allgeier and Summers, 1995) or using the rapid small scale column test protocol to evaluate GAC treatment (Bond and DiGiano, 2004). Simulated distribution system tests (Koch et al., 1991) were also performed in existing full-scale municipal water treatment plants that employed NF or GAC. (Note that even full-scale treatment facilities were not required to sample from their existing distribution system but to report SDS testing results.) In summary, a large number of measurements spanning a wide range of influent water quality parameters from surface- and ground-waters, and from unit processes with varying design variables, pretreatments, and operating conditions have been used to model the complex relationships that exist between DBP concentrations and various factors responsible for their formation.

3. Results and discussion

3.1. Need to derive separate ANNs for each treatment technique

Principal Component Analysis (PCA) was employed to determine if a single neural network was capable of predicting DBP concentrations formed in the four waters (raw, conventional treatment, GAC, and NF) or whether four separate ANNs were necessary, one for each water type (Massart et al., 1997). Specifically, PCA was carried out to assess the associations between the contributions of raw water and the three DBP precursor removal techniques (conventional treatment, GAC, and NF) in THM, HAA, and TOX formation. PCA also reduced the number of independent variables by generating a new coordinate system of uncorrelated variables called principal components. One unique feature of PCA is that variables with similar properties group together, separating out the variables with dissimilar properties. A data subset in which all input variables except DOC varied only in a narrow range for untreated water and the other three unit processes was initially identified.
All DBP concentrations in this subset were first normalized by the corresponding DOC concentrations to calculate the DBP yield. Next, PCA was performed using singular value decomposition. Fig. 1 depicts the PCA plot in which each symbol represents these normalized concentrations of TOX, THM4 (sum of the four THM species, viz. CHCl3, CHClBr2, CHCl2Br, and CHBr3), and HAA6 (sum of six HAA species, viz. CH2ClCOOH, CHCl2COOH, CCl3COOH, CH2BrCOOH, CHBr2COOH, and CHClBrCOOH) from the subset. (Note that during the development of the ICR database, stable analytical standards for three other HAAs containing chlorine and bromine, viz. CCl2BrCOOH, CClBr2COOH, and CBr3COOH, were not widely available commercially.) Fig. 1 shows the distinct separation of each of the four symbols, even appearing in different quadrants, implying large variations in the extent of DBP formation in raw water and after conventional treatment, NF, and GAC under similar chlorination conditions. In mechanistic terms, PCA demonstrates that DBP yields and formation mechanisms depend on the type of precursor removal method employed. Hence, individual ANN models for each treatment process and for raw water were necessary to capture the differing underlying aqueous chemistries and NOM characteristics that resulted in varying DBP yields upon chlorination. These results are consistent with the practice of deriving separate regression models for different unit processes (Chowdhury et al., 2009; Legube et al., 2004; Sadiq and Rodriguez, 2004a; Sohn et al.,
Table 1 Summary of ANN input parameters for the treatment studies employed in this manuscript.

| Treatment | TOC (mg/L) | UV254 (cm−1) | Br− (μg/L) | Cl2 dose (mg/L) | Temp. (°C) | pH | Contact time (h) |
|---|---|---|---|---|---|---|---|
| Untreated water | 1.72–14.40 | 0.047–0.673 | 37–510 | 3.70–20.0 | 20.0–28.8 | 7.7–9.1 | 6.3–83 |
| Conventional treatment | 0.90–5.86 | 0.021–0.161 | 20–1850 | 0.76–14.5 | 4.0–30.2 | 5.9–10.1 | 1.9–120 |
| GAC treatment | 0.05–8.25 | 0.001–0.225 | 20–1810 | 0.67–9.5 | 1.0–33.0 | 5.9–10.4 | 1.8–120.0 |
| Nanofiltration | 0.08–4.80 | 0.001–0.124 | 6–670 | 1.5–19.1 | 5.4–28.0 | 5.5–9.5 | 6.0–72.0 |
Fig. 1. Principal Component Analysis to determine the need for separate ANNs for individual water types. Raw, Conv., GAC, and NF denote raw water, conventionally treated water, GAC effluent, and NF permeate respectively.
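The PCA procedure behind Fig. 1 (normalizing DBP concentrations by DOC to obtain yields, then applying singular value decomposition) can be sketched as follows. This is an illustration under stated assumptions: NumPy is used, the yields are mean-centered but not otherwise scaled (the text does not say whether variables were standardized), and the four sample rows are made-up numbers rather than ICR measurements.

```python
import numpy as np

def pca_scores(concentrations, doc, n_components=2):
    """Project DBP yields (concentration / DOC) onto principal components.

    concentrations: (n_samples, n_dbps) array, e.g. columns TOX, THM4, HAA6
    doc:            (n_samples,) DOC concentrations
    Returns the component scores and the fraction of variance each explains.
    """
    yields = np.asarray(concentrations, float) / np.asarray(doc, float)[:, None]
    centered = yields - yields.mean(axis=0)
    # Rows of Vt are the principal axes; squared singular values scale with variance.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt[:n_components].T
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

# Made-up samples (rows: raw, conventional, GAC, NF; columns: TOX, THM4, HAA6).
conc_demo = [[200.0, 60.0, 40.0],
             [90.0, 30.0, 25.0],
             [30.0, 8.0, 6.0],
             [12.0, 5.0, 3.0]]
doc_demo = [4.0, 2.5, 1.0, 0.4]
scores, explained = pca_scores(conc_demo, doc_demo)
```

Plotting the first column of `scores` against the second gives a score plot of the kind shown in Fig. 1, in which samples with similar yields fall close together.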
2004) since the nature of DBP precursors (and consequently the yield and kinetics) are different in raw water and after various treatments. For example, coagulation is known to preferentially remove the fraction of NOM that is more hydrophobic, of higher molecular weight, and that has more binding sites (Singer, 1999). NF removes the higher molecular weight fractions and changes the specific ultraviolet absorbance (SUVA) and the Br−/DOC ratio between the feed and permeate waters (Chellam and Krasner, 2001). Similarly, precursor removal by GAC is a function of pore size distribution, NOM molecular weight distribution and heterogeneity (Singer, 1999). In other words, separate ANNs were necessary since NOM reactivity towards chlorine and DBP yield is changed based on the treatment process employed. Therefore, separate neural networks were developed to predict DBP formation in raw water, as well as after conventional treatment, GAC adsorption, and NF. The network configuration and parameters such as learning rate, number of hidden layers, number of neurons in each layer, initial weights, etc. were varied for each simulation prior to predictions to obtain the most reliable ANN model in each case. The optimal neural network architecture that gave best N25 values for each water type was determined using this procedure.

3.2. ANN predictions of DBP concentrations

Figs. 2, 3, 4, and 5 depict comparisons of ANN predictions of TOX, THM4, and HAA6 concentrations with experimental observations in untreated water, and waters purified by conventional treatment, GAC adsorption, and NF respectively. Measurements from bench-, pilot-, and full-scale treatment processes are all shown. In each case, the number of data points used for training (Ntrain), validation (Ntest),
percentage of predictions within 25% absolute relative error (N25), and the regression coefficient (R2) are also reported. Note that to be more stringent, only experimental measurements employed for validating the neural network model are shown. Training datasets are not depicted since they were extremely well predicted by ANN models and were superposed directly on the line of perfect agreement. A summary of the number of points used for ANN training and validation along with the N25 values and regression coefficients for individual DBPs is given in Table 2. As observed in Table 2 and Figs. 2, 3, 4, and 5, neural networks gave consistently high N25 values (77–98%) and high regression coefficients (0.78–0.98) even when using only 7–22% of the data for training for each water type. Good THM, HAA, and TOX predictions using ANNs agree with earlier reports for THMs (Rodriguez et al., 2003) and bromate (Legube et al., 2004) and unequivocally demonstrate that they are capable of accurately incorporating complex relationships that exist between precursor characteristics and chlorination conditions in forming DBPs even when using only a small fraction of available data for training. For raw water and conventionally treated water, training with ∼20–25% of the available data was sufficient to obtain N25 > 80% (see Figs. 2 and 3). For the GAC treated water (Fig. 4), training with only 8% of the data was adequate to obtain N25 > 80%, potentially since this dataset contained numerous measurements (∼3500) representing ∼60 drinking water treatment units that allowed the network to exclude repetitive measurements from training data. Satisfactory predictions of DBP formation in NF permeate waters shown in Fig. 5 required two hidden layers, each containing 4–8 neurons, unlike the other three networks (for untreated water,
Fig. 2. Comparisons of ANN predictions with experimental measurements for DBP formation in raw waters.
Fig. 3. Comparisons of ANN predictions with experimental measurements for DBP formation in conventionally treated waters.
conventionally treated water, and GAC treated water), where only one hidden layer was sufficient. The scatter observed in Figs. 2–5 can be partially attributed to geographical diversity of source waters, different treatment schemes, pH, and chemicals employed at individual locations, seasonal changes in water quality (especially NOM characteristics), varying design and operational conditions at each location, differences in GAC and membrane type, small variations in flow rates which are difficult to maintain and monitor precisely during large-scale on-site experiments and so on (Bond and DiGiano, 2004). It should be emphasized that changes in coagulant dosage and pH, type of coagulant, filtration conditions and filter design in conventional treatment, empty bed contact time (EBCT) and pretreatment in GAC adsorption, type of membrane, flux and recoveries in NF with season and location were all modeled using only one ANN for each unit process. Importantly, in spite of this variability, ANNs were able to satisfactorily predict DBP concentrations in all cases with meaningful input parameters demonstrating their robustness and ability to accurately model THM, HAA, and TOX formation in a variety of water treatment scenarios. Further, process variables (e.g. nanofilter permeate flux and feed water recovery, GAC EBCT, particle size and surface area, etc.) were not explicitly used as inputs. Rather, effluent water quality parameters obtained in a range of process operating conditions were input to ANNs. Accurate DBP predictions even in the absence of operating parameters imply that ANNs inherently captured the role of changing nature, characteristics, and concentrations of precursors with treatment (e.g. changing molecular weight distribution and functionality, specific ultraviolet absorbance at 254 nm, hydrophobicity, etc.).
3.3. Error distribution

ANNs' predictive ability was evaluated in terms of the overall distribution of absolute relative error (Bowen et al., 1998) for TOX, THM4, and HAA6 in each of the waters. As observed in Fig. 6, using less than 25% of experimental measurements for training was still sufficient to predict the majority of observations within 10% absolute relative error (N10 > 60%) under a wide range of operational and water quality conditions. Our results demonstrate that the size of training datasets could be substantially reduced compared with the more than 50% used in a previous study employing ANNs for predicting THM concentrations (Rodriguez et al., 2003). Similar results were also obtained for individual THM and HAA species in our study (see Table 2), which demonstrates that ANNs can satisfactorily estimate DBP concentrations during municipal water treatment even when using only a small fraction of available data for training. Reducing the training burden on ANNs is a practically important issue since simulated distribution system (SDS) testing and DBP analysis are time consuming, require well trained laboratory personnel, and are consequently expensive to conduct.

3.4. Relative importance of input variables

Neural networks' ability to partition the influence of input variables on the output was exploited in a manner similar to interpreting independent variables' contributions to a dependent variable in regression equations (Garson, 1991; Goh, 1994). These contributions expressed as percent relative importance (calculated using Eq. (4)) were used to interpret input–output relations in terms
Fig. 4. Comparisons of ANN predictions with experimental measurements for DBP formation in GAC effluent.
Fig. 5. Comparisons of ANN predictions with experimental measurements for DBP formation in nanofiltered waters.
of the chemistry of DBP formation. Table 3 summarizes the relative importance (in percentage terms) of each of the 7 inputs employed to predict TOX, THM4, and HAA6 concentrations in this study. Even though these are purely empirical predictions, some of the weights are consistent with mechanistic interpretations. For example, DOC was by far the most important factor for DBP formation in raw water, accounting for ∼40% weight for THM4, HAA6, and TOX. This is consistent with the most popular current approach for DBP control, which is to reduce NOM concentrations prior to chlorination. However, DOC was not always the most important factor in treated waters, especially for NF and GAC, suggesting different DBP formation mechanisms in untreated and treated waters. This result confirms PCA results summarized in Fig. 1 and the need to derive separate ANNs for each water type. For each DBP, the relative importance of Br− ion concentrations was higher for GAC effluent and NF permeate compared with raw water and conventional treatment, which is attributed to the large increase in Br−/DOC ratio by GAC and NF technologies leading to the preferential formation of the highly brominated DBPs (Chang et al., 2001; Chellam and Krasner, 2001; Singer, 1999; Symons et al., 1993). These results are discussed in more detail in the next section. Chlorine dose was the most important simulated distribution system (SDS) parameter compared with contact time, temperature, and pH. This is also consistent with current practice of reducing disinfectant dosage to reduce DBP concentrations.

3.5. Changes in DBP speciation with treatment

As discussed in the previous sections and Table 2, ANNs were able to statistically predict not only total THMs and HAA6 but also changes
in concentrations of individual THM and HAA species for raw and treated waters. Additionally, neuronal connection weights were meaningful from the standpoint of the chemistry underlying DBP formation. Hence, ANNs appear to be able to capture at least certain mechanistic aspects of DBP formation and not just make purely empirical calculations. This suggests that they are more robust than purely statistical regression models. In this section, the role of water treatment processes in inducing changes in individual DBP species is considered in more detail. Conventional treatment, GAC, and NF remove NOM to a much greater extent compared with the bromide ion increasing the Br−/DOC ratio. Therefore, these treatment processes not only reduce total DBP formation but also change THM and HAA speciation (Chellam and Krasner, 2001; Gould et al., 1983; Symons et al., 1996). This is particularly important since studies indicate increasing carcinogenicity and mutagenicity with bromine substitution e.g. (Myllykangas et al., 2003). Effects of treatment using a bituminous coal based GAC (F-400, Calgon Corp.) with 15-minute empty bed contact time on DBP control and changing THM speciation observed in this study are summarized in Fig. 7. The breakthrough curves for TOC, SDSTOX, and SDSTTHMs are seen in Fig. 7a. As expected, NOM removal by GAC decreased DBP precursor concentrations consequently decreasing TOX and total THM formation in the effluent over time under SDS conditions. NOM removal also increased Br−/DOC which can be expected to influence DBP speciation. Fig. 7b depicts mole fractions of individual THM species as functions of Br−/DOC molar ratio. Under the experimental conditions investigated, CHCl3 monotonically decreased and CHBr3 monotonically increased with increasing Br−/DOC whereas the mixed bromochloro species
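Table 2 Summary of ANN simulations of DBP concentrations in four water types. Raw = raw water; Conv. = conventional treatment; GAC = GAC effluent; NF = NF permeate.

| DBP | Raw N25 (%) | Raw R2 | Raw Ntrain | Raw Ntest | Conv. N25 (%) | Conv. R2 | Conv. Ntrain | Conv. Ntest | GAC N25 (%) | GAC R2 | GAC Ntrain | GAC Ntest | NF N25 (%) | NF R2 | NF Ntrain | NF Ntest |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TOX | 84 | 0.96 | 22 | 79 | 82 | 0.88 | 90 | 480 | 85 | 0.88 | 220 | 2885 | 83 | 0.92 | 53 | 292 |
| CHCl3 | 83 | 0.90 | 21 | 91 | 88 | 0.86 | 98 | 466 | 82 | 0.96 | 577 | 2726 | 83 | 0.98 | 48 | 247 |
| CHBrCl2 | 89 | 0.94 | 20 | 99 | BMRL | | | | 82 | 0.94 | 511 | 2996 | 85 | 0.98 | 63 | 264 |
| CHBr2Cl | 93 | 0.90 | 24 | 101 | 90 | 0.82 | 110 | 455 | 85 | 0.94 | 620 | 2814 | 87 | 0.96 | 70 | 304 |
| CHBr3 | BMRL | | | | 97 | 0.98 | 47 | 217 | 80 | 0.96 | 452 | 2662 | 87 | 0.75 | 80 | 272 |
| THM4 | 83 | 0.97 | 22 | 79 | 85 | 0.90 | 102 | 483 | 83 | 0.92 | 250 | 3300 | 80 | 0.92 | 42 | 319 |
| CH2ClCOOH | 99 | 0.98 | 15 | 47 | 88 | 0.98 | 46 | 212 | 64 | 0.94 | 140 | 436 | BMRL | | | |
| CHCl2COOH | 85 | 0.94 | 14 | 88 | 85 | 0.90 | 52 | 284 | 81 | 0.92 | 438 | 2739 | 87 | 0.98 | 71 | 250 |
| CCl3COOH | 88 | 0.94 | 18 | 73 | 84 | 0.92 | 84 | 433 | 79 | 0.94 | 289 | 2088 | 85 | 0.98 | 48 | 182 |
| CH2BrCOOH | BMRL | | | | 90 | 0.98 | 42 | 179 | 78 | 0.96 | 183 | 873 | BMRL | | | |
| CHBr2COOH | 88 | 0.90 | 16 | 83 | 83 | 0.98 | 92 | 365 | 84 | 0.96 | 538 | 2892 | 85 | 0.92 | 54 | 287 |
| CHClBrCOOH | 87 | 0.78 | 19 | 99 | 86 | 0.90 | 108 | 514 | 79 | 0.91 | 726 | 2975 | 82 | 0.96 | 45 | 252 |
| HAA6 | 79 | 0.92 | 18 | 76 | 82 | 0.71 | 92 | 479 | 82 | 0.92 | 237 | 3200 | 81 | 0.90 | 41 | 274 |

BMRL means that the majority of measurements were below minimum reporting level.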
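
The two summary statistics reported in Table 2 can be sketched in a few lines. This is a minimal illustration only. It assumes that N25 denotes the percentage of predictions whose absolute relative error is within 25% of the measured value and that R2 is the coefficient of determination computed as 1 − SSres/SStot. Both definitions, the function names, and the sample values are assumptions made for this sketch and are not taken from the study.

```python
def n25(measured, predicted, tol=0.25):
    """Percentage of predictions with |pred - meas| / meas <= tol.

    ASSUMPTION: this is the meaning of N25; measurements <= 0 are skipped
    because a relative error cannot be formed for them.
    """
    pairs = [(m, p) for m, p in zip(measured, predicted) if m > 0]
    if not pairs:
        raise ValueError("no positive measurements to compare against")
    within = sum(1 for m, p in pairs if abs(p - m) / m <= tol)
    return 100.0 * within / len(pairs)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (one common definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted concentrations (same units).
measured = [10.0, 20.0, 40.0, 80.0]
predicted = [12.0, 19.0, 55.0, 78.0]
print(n25(measured, predicted))        # 75.0: three of four predictions within 25%
print(r_squared(measured, predicted))
```

Reporting both statistics is useful because R2 is dominated by the largest concentrations, whereas a relative-error band weights low and high concentrations equally.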
P. Kulkarni, S. Chellam / Science of the Total Environment 408 (2010) 4202–4210
Fig. 6. Summary of relative error distributions for all ANN simulations.
peaked within the range of Br−/DOC ratios encountered. CHCl2Br, which contains one mole of Br per mole of THM, peaked at 5 μM/mM Br−/DOC. CHClBr2, which contains two moles of Br per mole of THM, peaked at twice that Br−/DOC ratio, at approximately 10 μM/mM. The bromine incorporation factor and bromide utilization in THMs were also quantified to study the degree of bromine substitution (Chellam and Krasner, 2001; Gould et al., 1983; Symons et al., 1993):

$$\text{Bromine incorporation factor} = \frac{\sum_{k=0}^{3} k\,[\mathrm{CHCl}_{3-k}\mathrm{Br}_{k}]}{\sum_{k=0}^{3} [\mathrm{CHCl}_{3-k}\mathrm{Br}_{k}]} \tag{5}$$

$$\text{Bromide utilization} = \frac{\sum_{i=1}^{3} i \times [\mathrm{CHCl}_{3-i}\mathrm{Br}_{i}]}{[\mathrm{Br}^{-}]} \tag{6}$$
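
Eqs. (5) and (6) can be evaluated directly from the four THM concentrations. The sketch below is a minimal illustration, assuming all concentrations are supplied in the same molar units. The function names and the sample concentrations are hypothetical and are not data from the study.

```python
def bromine_incorporation_factor(thm):
    """Eq. (5): thm[k] is the molar concentration of CHCl(3-k)Br(k), k = 0..3.

    The result is the average number of Br atoms per THM molecule (0 to 3).
    """
    if len(thm) != 4:
        raise ValueError("expected four concentrations: CHCl3, CHCl2Br, CHClBr2, CHBr3")
    total = sum(thm)
    if total == 0:
        raise ValueError("total THM concentration is zero")
    return sum(k * c for k, c in enumerate(thm)) / total


def bromide_utilization(thm, bromide):
    """Eq. (6): moles of Br incorporated into THMs per mole of Br- available."""
    if bromide <= 0:
        raise ValueError("bromide concentration must be positive")
    # k = 0 (CHCl3) carries no bromine, so including it adds nothing to the sum.
    return sum(k * c for k, c in enumerate(thm)) / bromide


# Hypothetical molar concentrations (e.g. in uM), ordered CHCl3 ... CHBr3.
thm_uM = [0.25, 0.25, 0.25, 0.25]
bromide_uM = 3.0
print(bromine_incorporation_factor(thm_uM))     # 1.5
print(bromide_utilization(thm_uM, bromide_uM))  # 0.5
```

Because every THM molecule carries three halogens, the corresponding chlorine incorporation is 3 minus the bromine incorporation factor, so a value of 1.5 marks equal Br and Cl incorporation.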
Bromide substitution parameters for the same GAC run (corresponding to Fig. 7a and b) are shown in Fig. 7c. Increases in Br−/DOC increased total Br incorporation while simultaneously decreasing Cl incorporation into THMs. Br and Cl incorporation were found to be equal at a Br−/DOC of 8.2 μM/mM, corresponding to a Br−/Cl2 molar ratio of 0.044, confirming that HOBr is more reactive than HOCl in forming THMs. The decreasing trend in bromide utilization with Br−/DOC in Fig. 7c can be attributed to reductions in DOC concentrations at a fixed Br− concentration. Low NOM concentrations in the GAC effluent signify the availability of only very few sites for bromine substitution. Since HOBr is a more powerful halogenating agent than HOCl, the brominated DBPs are formed first, with bromine consuming the available sites on NOM. In precursor-limited waters, bromide utilization is reduced because excess Br− cannot react further once all available NOM reactive sites are occupied. In other words, at the start of a GAC run (low DOC and high Br−/DOC), only a small fraction of the total Br− is substituted into NOM due to the paucity of total reactive sites and the majority of Br− cannot react, resulting in low bromide utilization. As DOC breaks through over time during the course of a GAC run (increasing DOC and decreasing Br−/DOC), the number of sites available for substitution concomitantly increases, allowing for greater bromide utilization.

NF was extremely effective in DBP precursor control but also induced significant shifts towards the brominated THM species. THM mole fractions in the NF feed water were in the order CHCl3 > CHCl2Br > CHClBr2 > CHBr3 (Fig. 8). Very high NOM removal combined with poor bromide ion removal by NF resulted in a large increase in the Br−/
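Table 3 Relative importance of water quality parameters and chlorination conditions on DBP formation. Values are percentages for trihalomethanes (THM4), haloacetic acids (HAA6), and total organic halide (TOX) in raw water (Raw), after conventional treatment (Conv.), in GAC effluent (GAC), and in NF permeate (NF).

| Parameter | THM4 Raw | THM4 Conv. | THM4 GAC | THM4 NF | HAA6 Raw | HAA6 Conv. | HAA6 GAC | HAA6 NF | TOX Raw | TOX Conv. | TOX GAC | TOX NF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DOC | 40 | 24 | 18 | 21 | 39 | 22 | 16 | 18 | 42 | 26 | 24 | 22 |
| UV254 | 8 | 12 | 10 | 17 | 11 | 11 | 13 | 18 | 17 | 21 | 11 | 21 |
| Bromide | 20 | 22 | 26 | 32 | 9 | 14 | 20 | 29 | 3 | 8 | 14 | 21 |
| Cl2 dose | 12 | 18 | 30 | 13 | 17 | 30 | 35 | 21 | 29 | 21 | 36 | 18 |
| Temperature | 4 | 8 | 8 | 5 | 7 | 6 | 5 | 3 | 4 | 9 | 3 | 6 |
| pH | 8 | 10 | 4 | 8 | 9 | 10 | 6 | 8 | 1 | 11 | 9 | 9 |
| Contact time | 8 | 6 | 4 | 4 | 8 | 7 | 5 | 3 | 4 | 4 | 3 | 3 |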
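
Relative importances such as those in Table 3 are derived from ANN connection weights. The sketch below shows one commonly cited weight-partitioning scheme of the kind described by Garson (1991), which appears in the reference list, for a network with a single hidden layer and one output. Whether the study used exactly this variant is an assumption, and the weights and function name are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def garson_importance(w_in_hidden, w_hidden_out):
    """Relative importance (%) of each input from absolute connection weights.

    w_in_hidden[i][j]: weight from input i to hidden neuron j.
    w_hidden_out[j]:   weight from hidden neuron j to the single output.
    ASSUMPTION: one hidden layer, one output, biases ignored.
    """
    n_inputs = len(w_in_hidden)
    n_hidden = len(w_hidden_out)
    contrib = [0.0] * n_inputs
    for j in range(n_hidden):
        col_sum = sum(abs(w_in_hidden[i][j]) for i in range(n_inputs))
        if col_sum == 0:
            continue  # this hidden neuron receives nothing from any input
        for i in range(n_inputs):
            # Share of hidden neuron j attributable to input i,
            # scaled by how strongly neuron j drives the output.
            share = abs(w_in_hidden[i][j]) / col_sum
            contrib[i] += share * abs(w_hidden_out[j])
    total = sum(contrib)
    if total == 0:
        raise ValueError("all connection weights are zero")
    return [100.0 * c / total for c in contrib]


# Hypothetical weights: 3 inputs, 2 hidden neurons.
w_in_hidden = [[2.0, 1.0],
               [1.0, 1.0],
               [1.0, 2.0]]
w_hidden_out = [1.0, 1.0]
print(garson_importance(w_in_hidden, w_hidden_out))  # [37.5, 25.0, 37.5]
```

Because the scheme uses absolute weights, it ranks inputs by magnitude of influence only and says nothing about whether an input raises or lowers the predicted concentration.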
DOC in the permeate water (60 μM/mM) compared to the feed water (1.8 μM/mM). This shifted THM formation towards the brominated species in the permeate water, where concentrations were in the order CHClBr2 > CHCl2Br > CHCl3 > CHBr3. Even though NF significantly removed total THM precursors, concentrations of the highly brominated species (CHBr3) actually increased in the permeate compared with the feed water. Similar observations have been made in other membrane–source water combinations as well (Chellam and Krasner, 2001).

Changes in THM speciation summarized in Figs. 7 and 8 would also have been influenced by changing Br−/Cl2 (Symons et al., 1993). A constant Cl2/TOC ratio could not be used in this study since a higher chlorine dose was required for waters with a higher TOC concentration in order to achieve a target chlorine residual of 0.75 mg/L at the end of SDS testing. Under these experimental conditions, Br−/Cl2 increased logarithmically as 9.2 ln(Br−/DOC) + 23.9, in which both ratios are expressed in μM/mM. Thus, the HOBr/HOCl ratio also increased with Br−/DOC, preferentially shifting DBP speciation towards the more brominated species.

It should be noted that HAA speciation was difficult to interpret quantitatively from the ICR treatment studies since only six of the nine HAA species containing chlorine and bromine were typically analyzed and several HAA species were often below minimum reporting levels. The interested reader can refer to earlier publications that offer a detailed interpretation of changes in HAA speciation with treatment (e.g., Chellam and Krasner, 2001; Liang and Singer, 2003).

Fig. 7. a. Breakthrough of NOM (measured as TOC) and precursors to TOX and TTHMs. Feed water TOC = 4.5 mg/L, SDSTOX = 223.7 mg Cl−/L, SDSTTHM = 85.1 μg/L, and Br− = 115 μg/L. b. Shift towards more brominated THM species with increasing Br−/DOC ratio in the GAC effluent. SDS conditions: 6-hour hold time, pH 9, Cl2 residual 0.75 mg/L. c. Effects of Br−/DOC ratio on halogen incorporation and bromide utilization in THMs following GAC treatment.

Fig. 8. General increase of brominated THMs in NF permeate compared with the feed water.

4. Implications and conclusions
Robust ANNs requiring relatively little training data satisfactorily predicted the formation of total trihalomethanes, the sum of six haloacetic acids, and total organic halide, as well as individual THM and HAA species, in chlorinated waters covering a geographically diverse area of the United States. Benchmarking predictions against an extensive set of experimental measurements demonstrated that ANNs can closely predict DBP concentrations (under SDS conditions) following conventional and advanced treatment. Hence, complex and nonlinear relationships between water quality parameters and chlorination conditions influencing DBP formation and speciation were successfully captured by ANNs, suggesting that they are viable alternatives to bench-scale laboratory testing to simulate large-scale unit processes. In other words, ANNs could be used for process optimization and control and even for evaluating changes in DBP formation when operating conditions are changed or when advanced technologies are implemented for NOM removal. Hence, ANNs are valuable tools to compare and select NOM removal alternatives and can also reduce the experimental burden associated with relatively expensive and time-consuming pilot-scale tests.

It should be emphasized that since DBP control in ICR treatment studies was evaluated through SDS testing, the ANN models presented herein are not strictly applicable to predicting DBP concentrations in existing full-scale distribution systems. Even though ANNs can provide water purveyors with quantitative estimates of DBP concentrations, they do not provide a comprehensive mechanistic understanding of the chemical reactions and kinetics involved in DBP formation (e.g., Hua and Reckhow, 2008; Liang and Singer, 2003; Obolensky and Singer, 2005). Importantly, as with all empirical models, care should be taken not to apply ANNs beyond the range of water quality parameters for which they were derived. Also, the existence of outliers in Figs. 2–5 demonstrates the inherent difficulties in accurately predicting individual DBP concentrations, especially when a single ANN is applied to a very wide range
of complex water chemistries, geographically diverse locations, and treatment parameters. In any case, all inputs for the ANNs derived herein are simple-to-measure water quality parameters that are routinely monitored in drinking water facilities, thereby facilitating their implementation to screen preliminary process alternatives for DBP control.

Furthermore, ANNs have also been reported to closely predict the kinetics of Giardia inactivation (Haas, 2004). Hence, ANNs appear to be able to respond to water quality variations and closely capture the complex aqueous-phase behavior of both protozoa and chemical contaminants. In contrast, mechanistic models are not yet available to predict either microorganism inactivation or DBP formation under conditions of drinking water treatment. Hence, ANNs appear to be a potentially useful tool to quantify the seemingly conflicting requirements of microbial and DBP regulations and subsequently make better decisions related to the design and operation of drinking water facilities to simultaneously meet existing primary drinking water standards.

Acknowledgments

This research has been funded by a grant from the National Science Foundation CAREER program (BES-0134301). The contents do not necessarily reflect the views and policies of the sponsors, nor does the mention of trade names or commercial products constitute endorsement or recommendation for use.

References

Allgeier SC, Shukairy HM, Westrick JJ. ICR treatment studies. J Am Wat Wks Assn 1998;90:70–82.
Allgeier SC, Summers RS. Evaluating NF for DBP control with the RBSMT. J Am Wat Wks Assn 1995;87:87–99.
Bond R, DiGiano FA. Evaluating GAC performance using the ICR database. J Am Wat Wks Assn 2004;96:96–104.
Bowen WR, Jones MG, Yousef HNS. Prediction of the rate of crossflow membrane ultrafiltration of colloids: a neural network approach. Chem Eng Sci 1998;53:3793–802.
Chang EE, Lin YP, Chiang PC. Effects of bromide on the formation of THMs and HAAs. Chemosphere 2001;43:1029–34.
Chellam S, Krasner SW. Disinfection by-product relationships and speciation in chlorinated nanofiltered waters. Environ Sci Technol 2001;35:3988–99.
Chellam S, Taylor JS. Simplified analysis of contaminant rejection during ground- and surface water nanofiltration under the information collection rule. Wat Res 2001;35:2460–74.
Chen C, Zhang X, He W, Lu W, Han H. Comparison of seven kinds of drinking water treatment processes to enhance organic material removal: a pilot test. Sci Total Environ 2007;382:93–102.
Chowdhury S, Champagne P, McLellan PJ. Models for predicting disinfection byproduct (DBP) formation in drinking waters: a chronological review. Sci Total Environ 2009;407:4189–206.
EPA. DBP/ICR analytical methods manual (EPA 814-B-96-002); 1996. Cincinnati, OH.
EPA. ICR treatment study database, version 1.0 (EPA 815-C-00-003); 2000. Cincinnati, OH.
Garson GD. Interpreting neural network connection weights. AI Expert 1991;6:47–51.
Goh ATC. Seismic liquefaction potential assessed by neural networks. J Geotech Eng 1994;120:1467–80.
Gould JP, Fitchhorn LE, Urheim E. Formation of brominated trihalomethanes: extent and kinetics. In: Jolley RL, Brungs W, Cotruva J, Mattice J, Jacobs V, editors. Water chlorination: environmental impact and health effects. 4. Ann Arbor, MI: Ann Arbor Science Publishers; 1983. p. 297–310.
Haas CN. Neural networks provide superior description of Giardia lamblia inactivation by free chlorine. Wat Res 2004;38:3449–57.
Hamidin N, Yu QJ, Connell DW. Human health risk assessment of chlorinated disinfection by-products in drinking water using a probabilistic approach. Wat Res 2008;42:3263–74.
Hong HC, Liang Y, Han BP, Mazumder A, Wong MH. Modeling of trihalomethane (THM) formation via chlorination of the water from Dongjiang River (source water for Hong Kong's drinking water). Sci Total Environ 2007;385:48–54.
Hua G, Reckhow DA. DBP formation during chlorination and chloramination: effect of reaction time, pH, dosage, and temperature. J Am Wat Wks Assn 2008;100:82–95.
Koch B, Krasner SW, Sclimenti MJ, Schimpff WK. Predicting the formation of DBPs by the simulated distribution system. J Am Wat Wks Assn 1991;83:62–70.
Legube B, Parinet B, Gelinet K, Berne F, Croue J-P. Modeling of bromate formation by ozonation of surface waters in drinking water treatment. Wat Res 2004;38:2185–95.
Liang L, Singer PC. Factors influencing the formation and relative distribution of haloacetic acids and trihalomethanes in drinking water. Environ Sci Technol 2003;37:2920–8.
Massart DL, Vandeginste BGM, Buydens LMC, De Jong S, Lewi PJ, Smeyers-Verbeke J. Handbook of chemometrics and qualimetrics: part A. Amsterdam, The Netherlands: Elsevier Science; 1997.
McGuire MJ, McLain JL, Obolensky A. Information Collection Rule Data Analysis. Denver, CO: Awwa Research Foundation; 2002.
Milot J, Rodriguez MJ, Serodes JB. Contribution of neural networks for modeling trihalomethanes occurrence in drinking water. J Water Resour Plann Manage 2002;128:370–6.
Myllykangas T, Nissinen TK, Mäki-Paakkanen J, Hirvonen A, Vartiainen T. Bromide affecting drinking water mutagenicity. Chemosphere 2003;53:745–56.
Obolensky A, Singer PC. Halogen substitution patterns among disinfection byproducts in the information collection rule database. Environ Sci Technol 2005;39:2719–30.
Obolensky A, Singer PC. Development and Interpretation of Disinfection Byproduct Formation Models Using the Information Collection Rule Database. Environ Sci Technol 2008;42(15):5654–60.
Platikanov S, Puig X, Martín J, Tauler R. Chemometric modeling and prediction of trihalomethane formation in Barcelona's water works plant. Wat Res 2007;41(15):3394–406.
Rodriguez MJ, Milot J, Sérodes J-B. Predicting trihalomethane formation in chlorinated waters using multivariate regression and neural networks. J Wat Supply Res Tech AQUA 2003;52:199–215.
Sadiq R, Rodriguez MJ. Disinfection by-products (DBPs) in drinking water and predictive models for their occurrence: a review. Sci Total Environ 2004a;321:21–46.
Sadiq R, Rodriguez MJ. Fuzzy synthetic evaluation of disinfection by-products—a risk-based indexing system. J Environ Manage 2004b;73:1–13.
Shimazu H, Kouchi M, Sugita Y, Yonekura Y, Kumano H, Hashiwata K, Hirota T, Ozaki N, Fukushima T. Developing a model for disinfection by-products based on multiple regression analysis in a water distribution system. J Wat Supply Res Tech AQUA 2005;54(4):225–37.
Singer PC. Formation and control of disinfection by-products in drinking water. Denver, CO: American Water Works Association; 1999.
Sohn J, Amy G, Cho J, Lee Y, Yoon Y. Disinfectant decay and disinfection by-products formation model development: chlorination and ozonation by-products. Wat Res 2004;38:2461–78.
Symons JM, Krasner SW, Sclimenti MJ, Simms LA, Sorensen HW, Speitel GE, et al. Influence of bromide ion on trihalomethane and haloacetic acid formation. In: Minear R, Amy GL, editors. Disinfection by-products in water treatment: the chemistry of their formation and control. Boca Raton, FL: Lewis Publishers; 1996. p. 91–130.
Symons JM, Krasner SW, Simms LA, Sclimenti M. Measurement of THM and precursor concentrations revisited: the effect of bromide ion. J Am Wat Wks Assn 1993;85:51–62.
Uyak V, Ozdemir K, Toroz I. Multiple linear regression modeling of disinfection byproducts formation in Istanbul drinking water reservoirs. Sci Total Environ 2007;378:269–80.
Westerhoff P, Debroux J, Amy GL, Gatel D, Mary V, Cavard J. Applying DBP models to full-scale plants. J Am Wat Wks Assn 2000;92:89–102.