20th European Symposium on Computer Aided Process Engineering – ESCAPE20 S. Pierucci and G. Buzzi Ferraris (Editors) © 2010 Elsevier B.V. All rights reserved.
Real-Time Inversion and Response Planning in Large-Scale Networks Angelica V. Wong,a Sean A. McKenna,b William E. Hart,c Carl D. Lairdd* a
Graduate Student, Artie McFerrin Dept. of Chemical Engineering, Texas A&M Univ., College Station, TX 77843,
[email protected] b Sandia National Laboratories, Albuquerque, NM 87109,
[email protected] c Sandia National Laboratories, Albuquerque, NM 87109,
[email protected] d Assistant Professor, Artie McFerrin Dept. of Chemical Engineering, Texas A&M Univ., College Station, TX 77843,
[email protected]; *corresponding author.
Abstract In this paper we present a mixed-integer linear programming formulation for performing source inversion in drinking water systems using discrete (yes/no) measurements available from manual grab samples. Given the large size of a typical water distribution system, standard water quality models are inappropriate for use in the mixed-integer framework. In this research, we demonstrate the use of an origin-tracking approach to develop the water quality model, and show how this model can be efficiently and exactly reduced prior to the formulation of the MILP, giving rise to a much smaller MILP. Furthermore, we demonstrate that this formulation can be efficiently solved in a real-time setting on large networks with over 10,000 nodes while considering over 100 time discretizations. Keywords: Water distribution systems, Pollution sources, Optimization, Algorithms
1. Introduction There is significant concern regarding water quality problems in drinking water distribution systems given their vulnerability to chemical and biological contamination. While security measures can be increased at water storage and treatment facilities to prevent accidental or intentional contamination, protecting the remainder of the distribution system poses a bigger challenge. Potential contamination locations include any one of the many access points in the network, including household water faucets or fire hydrants. One proposed method of protection is the installation of an early warning detection system composed of monitoring stations throughout the drinking water network. Due to the cost of installing and maintaining a fixed sensor grid, the number of sensors installed must be limited, and several researchers have investigated the problem of optimal sensor placement within drinking water networks [1-3]. However, on its own, an early warning detection system provides only a coarse measure of the time and location of the contamination event. During a contamination event, it is also important to determine the location, time, and/or time dependent magnitude of contamination injection. Once this information is available, the current extent of contamination within the system can be estimated, and an appropriate control and cleanup strategy can be devised to protect the population. The source inversion problem has been addressed by a number of different researchers [4-10]. In previous work we introduced a large-scale nonlinear programming approach that used real-time concentration information from an installed sensor grid to accurately determine the time and location of the contamination event [47]. While this approach allowed systematic identification of contaminant sources in
A. Wong et al.
real-time, it relied on concentration measurements of the contaminant from a fixed sensor grid. Given the cost and technical challenges, it is unlikely that utilities will have a sufficient sensor grid with the capability to measure concentrations of arbitrary contaminants. Instead, the majority of utilities will need to rely on standard water quality measures [11] (e.g. pH, salinity, residual chlorine) obtained through manual grab samples. Using fault detection approaches, these values can be used to estimate the presence or absence of contaminant (a yes/no discrete measurement). With this in mind, the following integrated real-time process is proposed. Given an initial customer inquiry or detection of contamination event, several response teams are sent to gather manual grab samples near the location of the initial detection. Measurements from these grab samples are then used to invert for the best estimate of the source of the contamination event. Initially, with only a few measurements, it will generally not be possible to uniquely identify the contamination source location, however, a list of the possible locations can be identified. Using this list and simulation results, grab sample locations for the next measurement cycle are selected. Each cycle of taking additional measurements and performing the source inversion is repeated until the source of the contaminant is uniquely identified. As part of this real-time process, this paper specifically addresses the formulation of an effective mixed-integer linear programming problem for determining the source of a contamination event given discrete (yes/no) measurements from sparse manual grab samples.
2. Problem Formulation Water distribution systems are usually represented as a network of links and nodes, where links represent pipes, pumps, and valves, and nodes represent tanks, junctions and water sources, like rivers and reservoirs. Here, plug flow is assumed for all pipes. Junctions are considered to be zero-volume, while tanks have volume, and complete mixing is assumed at each node. The hydraulic calculations are assumed to be independent of the water quality calculations [12-14]. Calculation of the hydraulic model for this system requires the network properties and demand patterns at each node. The flow patterns are then used as known inputs to the water quality model. Note that the time varying demand patterns and storage tank levels cause flow velocities to change in magnitude and even direction. The contaminant spread through the network due to a mass injection term at one or more nodes is described by a set of algebraic mass balances for junctions, ordinary differential equations for tanks, and partial differential equations for the pipes. Discretizing the pipes spatially along their length would lead to an inverse problem that is far too large for current numerical tools. Instead, a time discretization is selected and an origin-tracking approach [6] is used to pre-calculate the time-dependent delays and express the time varying concentrations at pipe boundaries as a function of the time varying node concentrations. The origin-tracking approach is performed on a pipe-bypipe basis, and has a computational complexity that is linear in the number of pipes, making it an efficient alternative for large networks. The differential equations for the tanks are approximated on the same time discretization using backward finite differences, and the entire water quality model can now be represented as a very large but sparse, linear system, Gc = Mm , where c is the complete vector of time discretized concentrations for pipe inlets, pipe outlets, and network nodes, m is the vector of time discretized injections for every node, and G and M describe the linear system resulting from the discretization and origin-tracking approach. To simulate a contamination event, it suffices to specify the injection vector m and solve the linear system for the concentration profiles. Simulation results using a time discretization of
Real-Time Inversion and Response Planning in Large-Scale Networks
15 minutes from our proposed water quality model compares well with the results produced by EPANET [15]. The goal of the source inversion formulation is to find the time profiles of the unknown injection variables m that result in calculated concentrations that best match the manual grab sample measurements. If contaminant concentration measurements were available, this could be formulated as a least-squares minimization. However, while the model calculates concentrations, the measurements are only discrete yes/no values. Therefore, to map the model output to the available measurement information, we select a threshold value for the contaminant concentration and assume that a concentration from the model above this threshold should yield a “yes” measurement and a concentration below this threshold should yield a “no” measurement. Here, let T refer to the set of time discretizations, while P and N refer to the complete set of network pipes and nodes, respectively. Also, let S be the set of node-time pairs representing the time and location where manual grab samples were taken. From the set S we can write two subsets. Set S- contains the node-time pairs where the presence of contaminant was not detected (discrete measurement was “no”). Set S+ contains the node-time pairs where contaminant was detected (discrete measurement was “yes”). A min-max objective is formulated that penalizes concentrations below the threshold ( th ) for node-pairs where contaminant was detected and penalizes concentrations above the threshold for node-pairs where contaminant was not detected: (1)
∑ max(0, c
min
ij
− th) +
(i, j )∈ S −
∑ max(0, th − c
ij
)
(i, j )∈ S+
Using a standard (and exact) reformulation for this objective, the complete contaminant source inversion problem is formulated as the following mixed integer linear program (MILP),
min
c,m,y,n, p
s.t.
∑n
i, j
(i, j ) ∈ S -
+
∑p
i, j
(2)
(i, j ) ∈ S +
Gc = Mm
(3)
0 ≤ mi, j ≤ B y i , ∀i ∈ N, j ∈ T
(4)
mi, j−1 ≤ mi, j , ∀i ∈ N, j ∈ T : j ≠ 0
(5)
∑y
(6)
i
≤ 1,
y i ∈ {0,1}
i ∈N
(7) (8)
c = [.. c i, j ..] is the complete vector of time discretized concentrations for the inlet of every pipe, the outlet of every pipe, and every node. The vector m = [.. m i, j ..] is the vector of unknown time discretized mass injections for every node. Here, y i is a 0-1 variable that allows injections at node i only if y i = 1. The big-M parameter B is a where
reasonable upper bound on the injection terms. While the contamination event can start at any time, once the injection starts we assume that it will continue until the contamination source is found. This constraint is enforced in Eqn. 5. Using
A. Wong et al.
measurement information from a particular sampling cycle, this formulation allows us to identify the best-fit estimate of the contamination source node. Due to the limited number of measurements, the source identification problem is almost guaranteed to have non-unique solutions. In order to find all possible solution (i.e. all potential contamination sources), the problem is solved repeatedly, cutting previous solutions until the measure of fit deteriorates. Given the large number of nodes (>10,000) and the number of timesteps (~100) for a typical water network model, the above MILP is too large for efficient solution on a standard desktop computer. Noting that the objective function only requires concentrations at node-time pairs where measurements were taken, we describe an efficient procedure to reduce the size of the water quality model. We can rewrite the water quality model as c = [G −1 M]m where, instead of the entire linear system, we are only interested in the rows corresponding to measurement node-time pairs. A particular row ri of G −1 M can be efficiently extracted by first solving for rˆi from GT rˆi = ei , where ei is a vector of zeros with a one in the ith index. Then the ith row is found by ri = M T rˆi . In this procedure, the matrix G is factored only once and the remaining calculations involve only a single backsolve and a single matrix-vector product for each row extracted. Eqn. 3 in the MILP is now replaced by a smaller linear system involving concentrations of the sampled nodes only, (9) Note that this is an exact reduction and not an approximation.
3. Numerical Results The effectiveness of this formulation is tested on a real water distribution system model [16], shown in Fig. 1. This network contains 12523 nodes, two constant
Figure 1: Municipal Water Network Model with Case Study Details
head sources, two tanks, 14822 pipes, four pumps, and five valves. Two contamination scenarios are considered, each over 48 hours with a time discretization of 15 minutes. Case I is a single injection scenario at Reservoir-12523 starting at hour 10. Case II is a single injection scenario at Junction-5901, also starting at hour 10. The measurement profiles for each contamination scenario are simulated using EPANET [15]. For each
Real-Time Inversion and Response Planning in Large-Scale Networks
sampling cycle, the mixed integer problems with the reduced water quality model are formulated in AMPL [17] and solved with CPLEX (www.ilog.com). For the first sampling cycle in Case I, contaminant is initially detected at Junction-1980 at hour 41 by a customer. This location is then sampled by utility personnel, along with three other nearby locations. We continue the cycle of performing source inversion and retrieving additional grab samples until the source location is identified. A limit of 6 manual grab samples is considered for each additional sampling cycle, and each sampling cycle takes an hour for completion. In Case II, contaminant is detected first at Junction-4544 at hour 30, and the same procedure is performed. The formulation was effective in finding the contamination source for both case studies (and several others that are not included in this paper). Fig. 2 compares the effectiveness for exact measurements and when measurements contain errors. For the case where measurements contain errors, one of the grab samples from each cycle was randomly selected to return the incorrect yes/no value. For Case I with exact measurements only three sampling cycles and a total of 13 measurements were required to narrow the number of potential contamination sources from 303 locations to the single correct contamination location. When measurements errors were present, three sampling cycles and a total of 16 measurements were required. For Case II with exact measurements four sampling cycles and a total of 19 measurements were required to narrow the number of potential contamination sources from 136 locations down to one. With measurement error, four sampling cycles and a total of 22 measurements were required. For each of these case studies, computational time was about two minutes to create the water quality model, one minute to reduce the model, and 30 seconds to 4 minutes to solve the inversion problem on an Intel Xeon 3.0 GHz desktop machine. Figure 2: Case Study Progress
4. Conclusions and Future Work The results demonstrate that the presented formulation was effective in determining the source of a contamination using discrete (yes/no) measurements at sparse locations and times, even in the presence of measurement error. This paper also shows how the water quality model can be efficiently reduced prior to the formulation of the MILP giving rise to a much smaller MILP that can be solved efficiently in a realtime setting. This paper only assumes a single contamination scenario, however the formulation allows for consideration of multiple events. This will be further investigated in future work. In addition, new grab sample locations during each cycle were selected manually. Mathematical programming can also be effectively used to determine optimal sampling locations. Formulations for grab sample placement will be combined with the inversion problem and included in future work. This work also
A. Wong et al.
assumes accurate demand patterns and hydraulic information. The impact of demand uncertainty will be investigated in future work.
5. Acknowledgements The authors gratefully acknowledge financial support provided by Sandia National Laboratories and PUB Singapore. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the U. S. Department of Energy’s National Nuclear Security Administration under Contract DE-AC04-94AL85000.
References [1] Ostfeld A. and Salomons E. (2004). “Optimal layout of early warning detection stations for water distribution systems security.”, J. Water Resour. Plann. Manage. 130(5), 377-385. [2] Berry, J., Fleischer, L., Hart, W. and Phillips, C. (2005). “Sensor placement in municipal water networks.”, J. Water Resour. Plann. Manage., 131(3), 237-243. [3] Murray, R., Baranowski, T., Berry, J. W., Hart, W. E., Janke, R., Phillips, C. A., Taxon, T. N., and Uber, J. G. (2009). “Sensor network design for drinking water contamination warning systems: a compendium of research results and case studies using TEVA-SPOT”, EPA/600/R-09/141, U.S. Environmental Protection Agency, Office of Research and Development, National Homeland Security Research Center, Cincinnati, OH. [4] van Bloemen Waanders, B. G., Bartlett, R. A., Biegler, L. T., and Laird, C. D. (2003). “Nonlinear programming strategies for source detection of municipal water networks.” Proc., EWRI Conf., Philadelphia. [5] Laird, C. D., Biegler, L. T., and van Bloemen Waanders, B. G., (2004). “Real-time, large scale optimization of water network systems using a subdomain approach.”, Proc., 2nd CSRI Conf. on PDE-Constrained Optimization, Santa Fe, NM. [6] Laird, C. D., Biegler, L. T., van Bloemen Waanders, B. G., and Bartlett, R. A. (2005). “Contamination source determination for water networks.”, J. Water Resour. Plann. Manage., 131(2), 125-143. [7] Laird, C. D., Biegler, L. T., and van Bloemen Waanders, B. G., (2006). “Mixed-integer approach for obtaining unique solutions in source inversion of water networks.”, J. Water Resour. Plann. Manage., 132(4), 242-251. [8] Preis, A., and Ostfeld, A. (2006). “Contamination source identification in water systems: a hybrid model trees-linear programming scheme.” J. Water Resour. Plann. Manage., 132(4), 263-273. [9] Guan, J., Aral, M. M., Maslia, M. L., and Grayman W. M. (2006). “Identification of contaminant sources in water distribution systems using simulation-optimization method: case study.” J. Water Resour. Plann. Manage., 132(4), 252-262 [10] De Sanctis, A. E., Shang, F., and Uber, J. G. “Real-time identification of possible contamination sources using network backtracking methods.” J. Water Resour. Plann. Manage., accepted (2009). [11] Hart, D.B. and McKenna, S.A. (2009). “CANARY User's Manual Version 4.1.” Tech. Rep. EPA/600/R-08/040A. [12] Zierolf, M., Polycarpou, M., and Uber, J. (1998). “Development and autocalibration of an input-output model of chlorine transport in drinking water distribution systems.” IEEE Trans. Control Syst. Technol., 6(4), 543-553. [13] Rossman, L., and Boulos, P. (1996). “Numerical methods for modeling water quality in distribution systems: A comparison.” J. Water Resour. Plann. Manage., 122(2), 137-146. [14] Shang, F., Uber, J., and Polycarpou, M. (2002). “Particle backtracking algorithm for water distriution system analysis.” J. Environ. Eng., 128(5), 441-450. [15] Rossman, L. (2000) “Epanet 2.” Tech. Rep. EPA/600/R-00/057, Washington, D.C. [16] Ostfeld A., et al. (2008) “The battle of the water sensor networks (BWSN): a design challenge for engineers and algorithms.” J. Water Resour. Plann. Manage., 134(6), 556558. [17] Fourer, R., Gay, D. M., and Kernighan, B. W. (1993). “ AMPL: a modeling language for mathematical programmng, Scientific Press, Danvers, Mass.